Proceedings of the ACL 2016 Student Research Workshop 2016
DOI: 10.18653/v1/p16-3020
QA-It: Classifying Non-Referential It for Question Answer Pairs

Abstract: This paper introduces a new corpus, QA-It, for the classification of non-referential it. Our dataset is unique in that it is annotated on question answer pairs collected from multiple genres, useful for developing advanced QA systems. Our annotation scheme makes clear distinctions between 4 types of it, providing guidelines for many erroneous cases. Several statistical models are built for the classification of it, showing encouraging results. To the best of our knowledge, this is the first time that s…

Cited by 5 publications (10 citation statements); references 6 publications.
“…If there are event-favoring properties of the context sentence that human participants are sensitive to, it is a tractable task to build automatic classifiers that learn to recognize such properties. This supports the idea that the task of differentiating anaphoric and pleonastic instances of It (Evans, 2001; Boyd et al., 2005; Bergsma and Yarowsky, 2011; Lee et al., 2016; Loáiciga et al., 2017) could potentially improve performance.…”
Section: Discussion (supporting)
confidence: 73%
“…In all three classifiers, the clause-anaphoric class was consistently predicted with lowest accuracy (Table 5), which is not surprising given that it only accounts for 11% of the retained data. In line with the observation of Lee et al. (2016) (Section 2), the embeddings do not show a stable contribution. In our case, this is likely related to the small amount of data, to which we add 300 dimensions per word.…”
Section: Results (supporting)
confidence: 83%
“…The inter-annotator agreement for the three categories was κ = 0.636, p < 0.0005, indicating substantial agreement between the annotators. This number corresponds to a percentage agreement of 77.47% for the three categories and is comparable to the 81% agreement reported in Lee et al. (2016). We perceived adjudication between cases of disagreement (237 instances) to be extremely arbitrary, so those cases were excluded rather than resolved.…”
Section: Data (supporting)
confidence: 75%
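The citation statement above compares Cohen's kappa against raw percentage agreement. As a minimal sketch of how those two numbers relate, the following computes both for two hypothetical annotators; the labels and annotator data here are invented for illustration and are not from the QA-It corpus:

```python
from collections import Counter

def cohen_kappa(ann_a, ann_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    assert len(ann_a) == len(ann_b)
    n = len(ann_a)
    # Observed agreement: fraction of instances labeled identically.
    observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    # Chance agreement expected from each annotator's label distribution.
    counts_a, counts_b = Counter(ann_a), Counter(ann_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two annotators over 10 instances, 3 categories.
ann_a = ["ref", "ref", "pleo", "pleo", "event", "event", "ref", "pleo", "event", "ref"]
ann_b = ["ref", "ref", "pleo", "event", "event", "event", "ref", "pleo", "pleo", "ref"]

pct_agreement = sum(a == b for a, b in zip(ann_a, ann_b)) / len(ann_a)  # 0.8
kappa = cohen_kappa(ann_a, ann_b)  # lower than 0.8, since chance is discounted
```

Because kappa subtracts the agreement expected by chance, it is always at most the raw percentage agreement, which matches the gap between κ = 0.636 and 77.47% reported above.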