Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017
DOI: 10.18653/v1/p17-1168

Gated-Attention Readers for Text Comprehension

Abstract: In this paper we study the problem of answering cloze-style questions over documents. Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader. This enables the reader to build query-specific representations of tokens in the document for accurate answer selection. The GA Reader obtains state-of-the-art res…
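The mechanism the abstract describes — each document token's representation is gated by a query-aware vector via element-wise multiplication — can be sketched in a few lines. This is a minimal NumPy illustration, assuming dot-product attention from each document token over the query tokens; the function and variable names are ours, not from the paper:

```python
import numpy as np

def gated_attention(doc_states, query_states):
    """One gated-attention layer (illustrative sketch).

    doc_states:   (T_d, h) intermediate RNN states of the document reader
    query_states: (T_q, h) query token embeddings/states
    Returns query-gated document states of shape (T_d, h).
    """
    # Attention scores of every document token over every query token.
    scores = doc_states @ query_states.T                      # (T_d, T_q)
    # Softmax over the query dimension (numerically stabilized).
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)                 # (T_d, T_q)
    # Per-token query summary vector.
    q_tilde = alpha @ query_states                            # (T_d, h)
    # Multiplicative (element-wise) gating of the document states.
    return doc_states * q_tilde                               # (T_d, h)
```

In the multi-hop architecture, a layer like this would be applied between successive document-reader RNN layers, so each hop refines query-specific token representations.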

Cited by 305 publications (353 citation statements). References 22 publications.

Citation statements, ordered by relevance:
“…This would allow us to tackle the problem of emote detection as a sequence modeling task, which is more natural since it is not easy to predict the emote of a message without context. Finally, as in Barbieri et al (2017), we plan to investigate character-based approaches to represent words and/or messages (Dhingra et al, 2016). For brands, gaining a new customer is more expensive than keeping an existing one. Therefore, the ability to keep customers with a brand is becoming more challenging these days.…”
Section: Discussion (mentioning)
Confidence: 99%
“…Given the proven effectiveness of recurrent neural networks in different tasks (Chung et al, 2014; Vinyals et al, 2015; Bahdanau et al, 2014, inter alia), which also include the modeling of tweets (Dhingra et al, 2016; Barbieri et al, 2017), our emote prediction model is based on RNNs, which are designed to learn from sequential data. We use the word-based B-LSTM architecture of Barbieri et al (2017), designed to model emojis on Twitter.…”
Section: Bi-directional LSTMs (mentioning)
Confidence: 99%
“…They implement a multi-hop attention mechanism from question to text (a Gated-Attention Reader (Dhingra et al, 2017)).…”
Section: YNU Deep (mentioning)
Confidence: 99%