Projects

DELFT
WWW 2020

Factoid QA system that combines knowledge-graph and free-text reasoning, using graph neural networks over Wikipedia. Outperforms BERT-based baselines on entity-rich questions.

System, Dataset
CANARD
EMNLP 2019

40,527 question-rewriting pairs for conversational QA that test coreference resolution and ellipsis in multi-turn dialog. Built on the QuAC dataset. Licensed under CC BY-SA 4.0.

Dataset

Human-in-the-loop generation of adversarial quiz bowl questions that stump computers but remain answerable by expert humans. Includes a live writing interface and a dataset of ~1,000 questions.

Dataset, Interface
QBLink
EMNLP 2018

Sequential open-domain QA dataset with 18,644 multi-step sequences (56,000 Q&A pairs) built from quiz bowl tossups. Tests contextual reasoning across chains of related questions.

Dataset