Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017
DOI: 10.18653/v1/P17-1055
Attention-over-Attention Neural Networks for Reading Comprehension

Abstract: Cloze-style reading comprehension is a representative problem in mining the relationship between a document and a query. In this paper, we present a simple but novel model called the attention-over-attention reader for better solving the cloze-style reading comprehension task. The proposed model places another attention mechanism over the document-level attention and induces "attended attention" for final answer predictions. One advantage of our model is that it is simpler than related works while giving excellent performance.
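For concreteness, here is a minimal NumPy sketch consistent with the mechanism the abstract describes: a column-wise softmax over a pairwise match matrix gives per-query-word attention over document words, a row-wise softmax averaged over document words gives query-level attention, and their product yields the final "attended attention" over the document. The random encodings, dot-product scorer, and dimensions are illustrative stand-ins, not the paper's learned representations.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

D, Q, h = 30, 8, 64                 # document length, query length, hidden size
doc = np.random.randn(D, h)         # stand-in document encodings
qry = np.random.randn(Q, h)         # stand-in query encodings

M = doc @ qry.T                     # pairwise match scores, shape (D, Q)

alpha = softmax(M, axis=0)          # per-query-word attention over document words
beta = softmax(M, axis=1)           # per-document-word attention over query words
beta_avg = beta.mean(axis=0)        # averaged query-level attention, shape (Q,)

s = alpha @ beta_avg                # "attended attention" over document words, (D,)
print(s.shape, s.sum())             # (30,) and ~1.0: a distribution over the document
```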


Cited by 334 publications (360 citation statements) · References: 18 publications

Citation statements (ordered by relevance):
“…One basic deep learning architecture for cQA is shown as Figure 1, which is used in a great many papers (Tan et al., 2015; Feng et al., 2015; Cui et al., 2016; Hu et al., 2014). The question and the comment are mapped into fixed-length word vectors.…”
Section: Methods (mentioning)
confidence: 99%
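As an illustration of that first step, the sketch below maps token sequences to fixed-length matrices of word vectors via an embedding lookup with padding. The toy vocabulary, embedding size, and padding scheme are assumptions for illustration, not details of the cited cQA architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"<pad>": 0, "how": 1, "do": 2, "i": 3, "reset": 4, "it": 5}
emb_dim = 50
E = rng.standard_normal((len(vocab), emb_dim)) * 0.1   # embedding matrix

def encode(tokens, max_len=10):
    """Map a token sequence to a fixed-length matrix of word vectors."""
    ids = [vocab.get(t, 0) for t in tokens][:max_len]  # OOV shares index 0 here
    ids += [0] * (max_len - len(ids))                  # pad to fixed length
    return E[ids]                                      # shape (max_len, emb_dim)

question = encode("how do i reset it".split())
comment = encode("reset it".split())
print(question.shape, comment.shape)                   # (10, 50) (10, 50)
```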
“…Our Chengyu cloze test task is similar to reading comprehension (Hermann et al., 2015; Cui et al., 2016; Kadlec et al., 2016; Seo et al., 2016) … (Xu et al., 2010) and improve Chinese word segmentation (Chan and Chong, 2008; Sun and Xu, 2011; Wang and Xu, 2017). Chengyus differ from metaphors in other languages (Tsvetkov et al., 2014; Shutova, 2010) because they do not follow the grammatical structure and syntax of modern Chinese.…”
Section: Related Work (mentioning)
confidence: 99%
“…The embedding matrix is regarded as a parameter and will be updated while training the neural network. This method is used by [35, 37, 42]. Another initialization method is to use publicly available pre-trained embeddings.…”
Section: Random Embeddings vs. Pre-trained Embeddings (mentioning)
confidence: 99%
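A short sketch contrasting the two initialization strategies this excerpt mentions, assuming a toy vocabulary; in practice the pre-trained rows would be read from published GloVe or word2vec files, and the matrix would then be updated (or frozen) during training.

```python
import numpy as np

vocab = {"<unk>": 0, "attention": 1, "reader": 2}
emb_dim = 4
rng = np.random.default_rng(0)

# Random initialization: the embedding matrix is a trainable parameter,
# updated by the optimizer together with the rest of the network.
E = rng.uniform(-0.05, 0.05, size=(len(vocab), emb_dim))

# Pre-trained initialization: overwrite rows with published vectors
# (toy stand-ins here; in practice loaded from GloVe/word2vec files).
pretrained = {"attention": np.ones(emb_dim), "reader": np.full(emb_dim, 0.5)}
for word, idx in vocab.items():
    if word in pretrained:
        E[idx] = pretrained[word]
```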