Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP 2017
DOI: 10.18653/v1/w17-5307

Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference

Abstract: The RepEval 2017 Shared Task aims to evaluate natural language understanding models for sentence representation, in which a sentence is represented as a fixed-length vector with neural networks and the quality of the representation is tested with a natural language inference task. This paper describes our system (alpha), which is ranked among the top in the Shared Task on both the in-domain test set (obtaining a 74.9% accuracy) and the cross-domain test set (also attaining a 74.9% accuracy), demonstrating that…
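
The abstract describes a sentence encoder that maps a sentence to a single fixed-length vector and is evaluated through natural language inference. As a rough illustration of that idea, the following is a minimal PyTorch sketch of a BiLSTM encoder with a gated-attention-style pooling; the module name, dimensions, and the exact form of the gating are assumptions, not the authors' released implementation.

```python
# Hedged sketch of a gated-attention sentence encoder for NLI; the scalar
# gate used as an attention weight is an assumption about the pooling scheme.
import torch
import torch.nn as nn


class GatedAttentionEncoder(nn.Module):
    """Encode a token sequence into one fixed-length vector."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # One scalar gate per time step, used as an attention weight.
        self.gate = nn.Linear(2 * hidden_dim, 1)

    def forward(self, tokens, mask):
        # tokens: (batch, seq_len) int ids; mask: (batch, seq_len) 0/1
        states, _ = self.bilstm(self.embed(tokens))            # (B, T, 2H)
        scores = self.gate(states).squeeze(-1)                 # (B, T)
        scores = scores.masked_fill(mask == 0, -1e9)
        weights = torch.softmax(scores, dim=-1)                # attention over T
        attended = torch.bmm(weights.unsqueeze(1), states).squeeze(1)  # (B, 2H)
        pooled, _ = states.masked_fill(mask.unsqueeze(-1) == 0, -1e9).max(dim=1)
        return torch.cat([attended, pooled], dim=-1)           # fixed-length vector
```

In an NLI setup, both the premise and the hypothesis would be passed through this shared encoder and the two resulting vectors combined as input to a three-way entailment classifier.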


Cited by 76 publications (37 citation statements)
References 20 publications
“…Sentence-encoding based models use the Siamese architecture (Bromley et al., 1993; Chen et al., 2017b) shown in Figure 2 (a). Parameter-tied neural networks are applied to encode both the context and the response.…”
Section: Sentence-encoding Based Methods (mentioning)
confidence: 99%
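
As a rough illustration of the parameter-tied (Siamese) setup in the excerpt, the sketch below applies one encoder instance, and hence one set of weights, to both inputs and classifies the pair from the concatenated vectors. The class name, classifier sizes, and the plain concatenation are assumptions, not details taken from the cited papers.

```python
# Hedged sketch of a Siamese (parameter-tied) sentence-pair model: the same
# encoder weights are applied to both inputs (e.g. context and response).
import torch
import torch.nn as nn


class SiameseMatcher(nn.Module):
    def __init__(self, encoder, encoding_dim, num_classes):
        super().__init__()
        self.encoder = encoder                  # shared: one set of weights
        self.classifier = nn.Sequential(
            nn.Linear(2 * encoding_dim, 300),
            nn.ReLU(),
            nn.Linear(300, num_classes),
        )

    def forward(self, left, left_mask, right, right_mask):
        u = self.encoder(left, left_mask)       # e.g. the context
        v = self.encoder(right, right_mask)     # e.g. the response
        return self.classifier(torch.cat([u, v], dim=-1))
```

Here `encoder` is any module that maps a token sequence and its mask to a fixed-length vector, such as the encoder sketched above.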
“…We evaluate the proposed AR-Tree on three tasks: natural language inference, sentence sentiment analysis, and author profiling. We set α = 0.1, λ = 1e-5 in Eq. 4 through all experiments.…”

Results table inlined in the excerpt (model, parameter count, accuracy %):

  Model                                                            #Params  Acc. (%)
  (Bowman et al., 2016)                                            3.7m     83.2
  300D NSE (Munkhdalai and Yu, 2017a)                              6.3m     84.8
  300D NTI-SLSTM-LSTM (Munkhdalai and Yu, 2017b)                   4.0m     83.4
  300D Gumbel Tree-LSTM (Choi, Yoo, and Lee, 2017)                 2.9m     85.0
  300D Self-Attentive (Lin et al., 2017)                           4.1m     84.4
  300D Tf-idf Tree-LSTM (Ours)                                     3.5m     84.5
  300D AR-Tree (Ours)                                              3.6m     85.5
  600D Gated-Attention BiLSTM (Chen et al., 2017)                  11.6m    85.5
  300D Decomposable attention (Parikh et al., 2016)                582k     86.8
  300D NTI-SLSTM-LSTM global attention (Munkhdalai and Yu, 2017b)  3.2m     87.3
  300D Structured Attention (Kim et al., 2017)                     2.4m     86.8

Section: Methods (mentioning)
confidence: 99%
“…In this study, we apply the one used in the attention module proposed in [16]. Dimension-wise features-based matching [43] is commonly used in many models such as [47], [48] for the task of natural language inference. The output of the matching step is scalar; therefore, we use a scoring layer which yields the input scalar value as it is.…”
Section: Reduce-match Models (mentioning)
confidence: 99%
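
The excerpt refers to dimension-wise feature-based matching between two sentence vectors. A common variant in NLI work combines the two vectors with their element-wise difference and product; the sketch below shows that variant reduced to a single matching score. The feature set and the small scoring MLP are assumptions and may differ from the formulations in [43], [47], [48].

```python
# Hedged sketch of dimension-wise matching between two sentence vectors,
# reduced to one scalar score per pair; the feature set and scorer are assumed.
import torch
import torch.nn as nn


def dimension_wise_match(u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Combine two (batch, dim) sentence vectors dimension by dimension."""
    return torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)


class ScalarScorer(nn.Module):
    """Reduce the matching features to a single matching score per pair."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(4 * dim, 128), nn.ReLU(),
                                  nn.Linear(128, 1))

    def forward(self, u, v):
        return self.proj(dimension_wise_match(u, v)).squeeze(-1)  # (batch,)
```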