Projects
DELFT
WWW 2020
Factoid QA system combining knowledge graph and free-text reasoning via graph neural networks over Wikipedia. Outperforms BERT-based baselines on entity-rich questions.
System, Dataset

CANARD
EMNLP 2019
40,527 question-rewriting pairs for conversational QA, testing coreference resolution and ellipsis in multi-turn dialog. Built on the QuAC dataset. Licensed CC BY-SA 4.0.
Dataset

Adversarial Question Writing
TACL 2019
Human-in-the-loop generation of adversarial quiz bowl questions that stump computers but remain answerable by expert humans. Includes a live writing interface and a dataset of ~1,000 questions.
Dataset, Interface

QBLink
EMNLP 2018
Sequential open-domain QA dataset with 18,644 multi-step sequences (56,000 Q&A pairs) built from quiz bowl tossups. Tests contextual reasoning across chains of related questions.
Dataset