Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
DOI: 10.18653/v1/2020.emnlp-main.442
SubjQA: A Dataset for Subjectivity and Review Comprehension

Abstract: Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to be important for sentiment analysis and word-sense disambiguation. Furthermore, subjectivity is an important aspect of user-generated data. In spite of this, subjectivity has not been investigated in contexts where such data is widespread, such as in question answering (QA). We develop a new dataset which allows us to investigate this relationship. We find that subjectivity is a…

Cited by 19 publications (12 citation statements)
References 31 publications
“…We fine-tune BERT on either SQuAD v2.0 (Rajpurkar et al., 2018a) or SubjQA (Bjerva et al., 2020) before investigating the hidden representations. Since we analyse the similarity of hidden representations across answer span tokens, we only fine-tune BERT on answerable questions.…”
Section: Methods
Confidence: 99%
“…We experiment on two English-language QA datasets: SQuAD v2.0 (Rajpurkar et al., 2018a) and SubjQA (Bjerva et al., 2020). Since SQuAD v2.0 exclusively contains objective questions that belong to a single domain, Wikipedia, we contrast this with the more diverse SubjQA.…”
Section: Data
Confidence: 99%
“…The earlier one was based on a Web crawl of questions and answers about products posed by users [167], and the more recent one (AmazonQA [101]) built upon it by cleaning up the data, and providing review snippets and (automatic) answerability annotation. SubjQA [28] is based on reviews from more sources than just Amazon, has manual answerability annotation and, importantly, is the first QA dataset to also include labels for subjectivity of answers.…”
Section: Domains
Confidence: 99%