QBLink: Sequential Open-Domain Question Answering
QBLink is a dataset for sequential question answering where multiple related questions about the same topic are answered in sequence. It evaluates how well QA systems leverage context from previous questions and answers.
18,644 sequences · 56,000 question–answer pairs
Dataset Structure
Each sequence contains:
| Field | Description |
|---|---|
| `id` | Sequence identifier |
| `tournament` | Quiz bowl tournament source |
| `lead-in` | Introductory sentence defining the topic |
| `category` | Subject area (History, Literature, Philosophy, etc.) |
| `sub-category` | More specific classification |
| Questions 1–3 | Each with `question_text`, `raw_answer`, `wiki_page` |
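The fields above can be mirrored in a small typed container. The following is only a sketch: the field names come from the table, but the on-disk layout is an assumption. In particular, the `questions` key holding a list of question objects and the class and function names (`Question`, `Sequence`, `parse_sequence`) are hypothetical, so adjust them to the actual release files.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Question:
    question_text: str
    raw_answer: str
    wiki_page: str


@dataclass
class Sequence:
    id: str
    tournament: str
    lead_in: str
    category: str
    sub_category: str
    questions: list[Question]


def parse_sequence(record: dict[str, Any]) -> Sequence:
    """Convert one raw record (a dict) into a Sequence.

    Assumption: the record uses the field names from the table and keeps
    its questions in a list under the hypothetical key "questions".
    """
    questions = [
        Question(
            question_text=q["question_text"],
            raw_answer=q["raw_answer"],
            wiki_page=q["wiki_page"],
        )
        for q in record["questions"]
    ]
    return Sequence(
        id=str(record["id"]),
        tournament=record["tournament"],
        lead_in=record["lead-in"],
        category=record["category"],
        sub_category=record["sub-category"],
        questions=questions,
    )
```

Keeping the lead-in and the ordered questions in one object makes it straightforward to pass the whole sequence to a model that needs earlier turns as context.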
Example sequences cover topics such as Bitcoin’s inventor or Ronald Reagan’s presidency, where later questions reference earlier answers to test contextual reasoning.
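One simple way to expose earlier turns to a QA system is to prepend the lead-in and the previous question–answer pairs to each new question. The sketch below illustrates that idea only; it is not the baseline from the paper, and the function name `build_contextual_inputs` and the plain string concatenation are assumptions chosen for illustration.

```python
def build_contextual_inputs(lead_in: str, questions: list[dict]) -> list[str]:
    """Return one input string per question, each carrying prior context.

    Each question dict is assumed to have "question_text" and "raw_answer".
    Here the gold answers of earlier questions are used as history; a system
    under evaluation would presumably substitute its own predictions.
    """
    inputs = []
    history = []  # earlier "question answer" strings, in order
    for q in questions:
        parts = [lead_in, *history, q["question_text"]]
        inputs.append(" ".join(parts))
        history.append(f'{q["question_text"]} {q["raw_answer"]}')
    return inputs
```

With this layout, the first question sees only the lead-in, while the third sees the lead-in plus both preceding question–answer pairs.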
Citation
Ahmed Elgohary, Chen Zhao, Jordan Boyd-Graber. Dataset and Baselines for Sequential Open-Domain Question Answering. EMNLP 2018.