2020
DOI: 10.1162/tacl_a_00317

TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages

Abstract: Confidently making progress on multilingual modeling requires challenging, trustworthy evaluations. We present TyDi QA—a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs. The languages of TyDi QA are diverse with regard to their typology—the set of linguistic features each language expresses—such that we expect models performing well on this set to generalize across a large number of the world’s languages. We present a quantitative analysis of the data qual…

citations
Cited by 290 publications
(318 citation statements)
references
References 60 publications
“…XNLI was created by translating examples from the English MultiNLI data set, and projecting its sentence labels (Williams, Nangia, and Bowman 2018). Other recent multilingual data sets target the task of question answering based on reading comprehension: i) MLQA (Lewis et al 2019) includes 7 languages; ii) XQuAD (Artetxe, Ruder, and Yogatama 2019) 10 languages; and iii) TyDiQA (Clark et al 2020) 9 widely spoken typologically diverse languages. While MLQA and XQuAD result from the translation from an English data set, TyDiQA was built independently in each language.…”
Section: Previous Work and Evaluation Data; citation type: mentioning
confidence: 99%
“…It can also be used to perform QA in current events via the CORD-19 COVID-19 (Wang et al, 2020; Tang et al, 2020) dataset. In the future we plan on experimenting with additional QA datasets such as Natural Questions (Kwiatkowski et al, 2019) and TyDiQA (Clark et al, 2020).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…The MLQA dataset contains parallel instances in 7 languages where the context is found in Wikipedia. The TyDiQA (Clark et al, 2020) dataset contains instances in 11 languages. However, TyDiQA is not parallel and it only has instances where the question and context are in the same language.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…The English portion includes instructions by speakers in the USA (en-US) and India (en-IN). Unlike Chen and Mooney (2011) and like the TyDi-QA multilingual question answering dataset (Clark et al, 2020), RxR's instructions are not translations: all instructions are created from scratch by native speakers. This especially matters for VLN, as different languages encode spatial and temporal information in idiosyncratic ways-e.g., how contact/support relationships are expressed (Munnich et al, 2001), frame of reference (Haun et al, 2011), and how temporal accounts are expressed (Bender and Beller, 2014).…”
Section: Motivation; citation type: mentioning
confidence: 99%