Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.597
ESTER: A Machine Reading Comprehension Dataset for Reasoning about Event Semantic Relations

Cited by 17 publications (7 citation statements)
References 18 publications
“…Rogers et al. [33] propose an "evidence format" for the explainable part of a dataset, composed of Modality (Unstructured text, Semi-structured text, Structured knowledge, Images, Audio, Video, Other combinations) and Amount of evidence (Single source, Multiple sources, Partial source, No sources). (a) spatial reasoning: bAbI [107], SpartQA [108]; (b) temporal reasoning: event order (QuAIL [109], TORQUE [110]), event attribution to time (TEQUILA [111], TempQuestions [112]), script knowledge (MCScript [113]), event duration (MCTACO [114], QuAIL [109]), temporal commonsense knowledge (MCTACO [114], TIMEDIAL [115]), factoid/news questions with answers where the correct answers change with time (ArchivalQA [116], SituatedQA [117]), temporal reasoning in a multimodal setting (DAGA [118], TGIF-QA [119]); (c) belief states: Event2Mind [120], QuAIL [109]; (d) causal relations: ROPES [121], QuAIL [109], QuaRTz [122], ESTER [123]; (e) other relations between events (subevents, conditionals, counterfactuals, etc.): ESTER [123]; (f) entity properties and relations: social interactions (SocialIQa [124]), properties of characters (QuAIL [109]), physical properties (PIQA [125], QuaRel [126]), numerical properties (NumberSense [127]); (g) tracking entities: across locations (bAbI [arXiv:1502.05698]), in coreference chains (Quoref [128],…”
Section: Big Bench Datasets For (mentioning)
confidence: 99%
“…These datasets are in different formats such as NLI, Question Answering (QA), and Reading Comprehension (RC). They target a large set of skills including monotonicity (Yanaka et al., 2019a), deductive logic, event semantics (Han et al., 2021), physical and social commonsense (Sap et al., 2019; Bisk et al., 2019), defeasible reasoning (Rudinger et al., 2020), and more. Our work brings together a set of challenge datasets to build a benchmark covering a large set of specific linguistic skills.…”
Section: Related Work (mentioning)
confidence: 99%
“…When study units are organized textual data, we find it meaningful to further divide observed covariates into two broad categories: "explicit observed covariates" that could be derived from the organized textual data at face value, e.g., the number of theorems/equations/figures in a conference paper, and "implicit observed covariates" that capture deeper aspects intrinsic to the textual data. Some concrete examples of implicit covariates include: bag-of-words embeddings such as Word2Vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014), and contextual embeddings such as BERT (Devlin et al., 2019) and SentenceBERT (Reimers and Gurevych, 2019); perceived sentiments, tones, and emotions from the text (Barbieri et al., 2020; Pérez et al., 2021); topic modeling and keyword summarization (Xie et al., 2015; Blei and Lafferty, 2007; Ramage et al., 2009; Wang et al., 2020; Santosh et al., 2020); evaluated trustworthiness of the claims made (Nadeem et al., 2019; Zhang et al., 2021b); temporal and semantic relationships of events mentioned (Zhou et al., 2021; Han et al., 2021); commonsense knowledge reasoning (such as complex relations between events, consequences, and predictions) based on the text (Chaturvedi et al., 2017; Speer et al., 2017; Hwang et al., 2021; Jiang et al., 2021). These are by no means exhaustive; nor are they necessary for each and every causal query.…”
Section: A Dichotomy Of Covariates (mentioning)
confidence: 99%
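The statement above describes deriving "implicit observed covariates" from text via contextual embeddings such as SentenceBERT. As a minimal sketch of that idea (assuming the sentence-transformers library and the "all-MiniLM-L6-v2" checkpoint, which are illustrative choices and not the cited papers' actual setup), such covariates could be computed as follows:

# Minimal sketch: turning text units into "implicit observed covariates"
# by encoding each document as a fixed-length sentence embedding.
# Assumes the sentence-transformers package; model name is a hypothetical choice.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "The committee approved the proposal after a lengthy debate.",
    "Markets fell sharply following the announcement.",
]

# Each row is a dense vector that can be used as a covariate
# for the corresponding study unit in a downstream causal analysis.
implicit_covariates = model.encode(documents)
print(implicit_covariates.shape)  # (2, 384) for this particular model

Other implicit covariates mentioned in the quote (sentiment scores, topic proportions, event-relation features) would be extracted analogously, each producing a per-document feature vector.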