Efficient Multimodal Question Answering (EMM-QA)

Efficient multimodal question answering in the era of large language models.

EMM-QA is an ICML 2026 workshop focused on question answering systems that must balance accuracy, efficiency, and adaptability across multiple input modalities. The workshop brings together researchers from academia and industry working on knowledge-intensive multimodal systems that operate under practical resource constraints.

Rather than focusing only on larger models, the workshop emphasizes methods that make multimodal question answering usable in real settings, including retrieval-augmented systems, compact models, efficient inference, and human-in-the-loop evaluation.

Join the community on Discord.

Scope

The workshop is centered on efficient multimodal question answering. It also welcomes closely related work on multimodal retrieval, reasoning, evaluation, benchmarking, and efficient inference when those contributions are clearly connected to question answering or other knowledge-intensive multimodal tasks.

As in the previous iteration of EfficientQA, which focused on text-only question answering, we will host a human-computer question answering competition. If you'd like to take part (it should be fun!), you can either play on a team or write questions.

Workshop Format

The workshop is planned as a one-day event combining:

  • Contributed papers
  • Poster presentations
  • Invited keynotes
  • Shared-task highlights
  • A live human-computer question answering event
  • A panel discussion

The workshop will also serve as the venue where we announce the winning systems from the QANTA 2026 computer competition.

Schedule

  • The workshop takes place on July 11th.
  • All talk sessions (invited talks, spotlights, challenge talks, awards, etc.) will take place in the main workshop room at COEX.
  • All poster sessions will take place separately in Hall A outside the workshop room area at COEX.

Time | Activity | Duration
08:00‑08:10 | Welcome & Workshop Overview | 10 min
08:10‑08:50 | 🟦 Robin Jia: TBA | 40 min
08:50‑09:00 | Q&A | 10 min
09:00‑09:15 | ☕ Coffee Break | 15 min
09:15‑09:55 | 🟦 Sewon Min: TBA | 40 min
09:55‑10:05 | Q&A | 10 min
10:05‑10:50 | 🟨 Contributed Paper Spotlights | 45 min
10:50‑11:50 | 🟧 Workshop Posters | 60 min
11:50‑12:50 | Lunch | 60 min
12:50‑13:20 | 🤖 Live AI QA Competition | 30 min
13:20‑14:00 | 🟦 Mrinmaya Sachan: TBA | 40 min
14:00‑14:10 | Q&A | 10 min
14:10‑14:50 | 🟦 Naman Goyal & Jenny Ni: Multimodal Robustness Under Distribution Shift | 40 min
14:50‑15:00 | Q&A | 10 min
15:00‑15:15 | ☕ Coffee Break | 15 min
15:15‑15:35 | Shared Challenge Introduction & Results Overview | 20 min
15:35‑15:55 | 🟨 Best Challenge Team Talks | 20 min
15:55‑16:05 | 🏆 Challenge Awards | 10 min
16:05‑16:10 | Closing Remarks | 5 min
16:10‑17:00 | 🟧 Shared Challenge Posters | 50 min

Legend

  • 🟦 Invited Talks
  • 🟨 Contributed Paper Spotlights / Best Challenge Team Talks
  • 🟧 Poster Sessions

Confirmed Keynote Speakers

  • Sewon Min (UC Berkeley EECS & Allen Institute for AI)
  • Mrinmaya Sachan (ETH Zürich)
  • Robin Jia (University of Southern California)
  • Naman Goyal (Google DeepMind) & Jenny Ni (Google)

Organizers

  • Jordan Boyd-Graber, University of Maryland
  • Martin Fajčík, Brno University of Technology
  • George Jojo Boateng, ETH Zurich / Kwame AI
  • Ikuya Yamada, Studio Ousia / Tohoku University / Nagoya University / RIKEN
  • Chen Zhao, NYU Shanghai

Contact

Questions about the workshop can be sent to emm-qa-organizers@googlegroups.com, or you can join the Discord.

Sponsors/Acknowledgements

  • This workshop is partially supported by the Horizon EU programme through the ELOQUENCE project, grant no. 101135916.