Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)
DOI: 10.18653/v1/n19-1421

CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

Abstract: When answering a question, people often draw upon their rich world knowledge in addition to the particular context. Recent work has focused primarily on answering questions given some relevant document or context, and required very little general background. To investigate question answering with prior knowledge, we present COMMONSENSEQA: a challenging new dataset for commonsense question answering. To capture common sense beyond associations, we extract from CONCEPTNET (Speer et al., 2017) multiple target co…
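The abstract describes a dataset built from CONCEPTNET target concepts. As a minimal sketch (not from the paper), here is how one might load and inspect the released dataset with the Hugging Face datasets library; the dataset id "commonsense_qa" and the field names are assumptions based on the public release, not something stated on this page.

```python
# Hedged sketch: load CommonsenseQA and inspect one example.
# Assumes the public Hugging Face release under the id "commonsense_qa"
# with fields `question`, `choices` ({label, text}), and `answerKey`.
from datasets import load_dataset

ds = load_dataset("commonsense_qa", split="validation")
example = ds[0]

print(example["question"])               # natural-language question
for label, text in zip(example["choices"]["label"],
                       example["choices"]["text"]):
    print(f"  {label}: {text}")          # five candidate answers
print("gold:", example["answerKey"])     # label of the correct choice
```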

Cited by 321 publications (267 citation statements); references 21 publications.
Citation statements: 3 supporting, 264 mentioning, 0 contrasting. Citing publications span 2020–2024.
“…Across NLP, a lot of work has been published around different kinds of reasoning. To name a few, common sense (Talmor et al., 2019), temporal (Zhou et al., 2019), numerical (Naik et al., 2019; Wallace et al., 2019b) and multi-hop (Khashabi et al., 2018) reasoning have all garnered immense research interest.…”
Section: Results (mentioning, confidence: 99%)
“…While recent advances in large-scale neural language models (Radford et al., 2019; Raffel et al., 2019) have led to strong performance on several commonsense reasoning benchmarks (Talmor et al., 2019; Lv et al., 2020; Sakaguchi et al., 2020), their accuracy by and large depends on the availability of large-scale human-authored training data. However, crowdsourcing examples at scale for each new task and domain can be prohibitively expensive.…”
Section: Introduction (mentioning, confidence: 99%)
“…Using this input format, we apply a new bidirectional word-level scoring function that leverages the MLM head (Devlin et al., 2019) tuned during the pre-training phase (see Figure 1 for an overview of the proposed approach). This method produces strong zero-shot baselines on the COPA (Gordon et al., 2012), Swag (Zellers et al., 2018), HellaSwag (Zellers et al., 2019) and CommonsenseQA (Talmor et al., 2019) datasets. Then, we fine-tune this new scoring function with a margin-based loss as proposed in .…”
Section: Introduction (mentioning, confidence: 99%)
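To make the excerpt's scoring idea concrete, below is a hedged sketch of MLM-based scoring for a multiple-choice question: each candidate answer is appended to the question, and the candidate whose sentence receives the highest pseudo-log-likelihood under a masked language model wins. The model name, the example question, and the per-token masking scheme are illustrative assumptions, not the cited paper's exact bidirectional word-level scoring function.

```python
# Hedged sketch: pseudo-log-likelihood scoring of answer candidates
# with an off-the-shelf MLM. "bert-base-uncased" and the example
# question are placeholders, not the cited paper's setup.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pll_score(text: str) -> float:
    """Sum of log-probs of each token when it is masked in turn."""
    ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

question = "Where would you find a shark before it was caught?"
choices = ["sea", "tomato soup", "bookshelf"]
best = max(choices, key=lambda c: pll_score(f"{question} {c}."))
print(best)  # candidate the MLM finds most plausible, here "sea"
```

Per-token masking requires one forward pass per token, so this zero-shot scorer is slow for long sequences; the excerpt's fine-tuned margin-based variant trains the same scoring function to separate the gold answer from distractors.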