Character-level Intra Attention Network for Natural Language Inference

Han, Yang; Costa-jussà, Marta R.; Fonollosa, José A. R.

doi:10.18653/v1/w17-5309

Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP 2017

DOI: 10.18653/v1/w17-5309

|View full text |Cite

Character-level Intra Attention Network for Natural Language Inference

Yang Han¹,

Marta R. Costa-jussà

José A. R. Fonollosa

Abstract: Natural language inference (NLI) is a central problem in language understanding. End-to-end artificial neural networks have reached state-of-the-art performance in NLI field recently.In this paper, we propose Characterlevel Intra Attention Network (CIAN) for the NLI task. In our model, we use the character-level convolutional network to replace the standard word embedding layer, and we use the intra attention to capture the intra-sentence semantics. The proposed CIAN model provides improved results based on a … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

Supporting

Mentioning

Contrasting

Year Published

2017

Publication Types

Select...

Other2

Relationship

Self Cite0

Independent2

Authors

Journals

Cited by 2 publications

(1 citation statement)

References 11 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…RepEval 2017 Cha-level Intra-attention BiLSTM encoders(Yang et al, 2017) Experimental results of different models on MultiNLI data. SNLI Mix : use of SNLI training dataset.…”

mentioning

confidence: 99%

Distance-based Self-Attention Network for Natural Language Inference

Im,

Cho

2017

Preprint

View full text Add to dashboard Cite

Attention mechanism has been used as an ancillary means to help RNN or CNN. However, the Transformer (Vaswani et al., 2017) recently recorded the state-of-theart performance in machine translation with a dramatic reduction in training time by solely using attention. Motivated by the Transformer, Directional Self Attention Network (Shen et al., 2017), a fully attention-based sentence encoder, was proposed. It showed good performance with various data by using forward and backward directional information in a sentence. But in their study, not considered at all was the distance between words, an important feature when learning the local dependency to help understand the context of input text. We propose Distance-based Self-Attention Network, which considers the word distance by using a simple distance mask in order to model the local dependency without losing the ability of modeling global dependency which attention has inherent. Our model shows good performance with NLI data, and it records the new state-of-the-art result with SNLI data. Additionally, we show that our model has a strength in long sentences or documents. * The NLI task can be solved through two different approaches: sentence encoding-based models and joint models.The former separately encode each sentence, whereas the latter take into account the direct relationship between two sentences. Between them, sentence-encoding based models focus on training sentence encoder that can represent sentences in vector form well. We focus on the former approach, since the objective of our work is to develop an advanced sentenceencoding model.

show abstract

“…RepEval 2017 Cha-level Intra-attention BiLSTM encoders(Yang et al, 2017) Experimental results of different models on MultiNLI data. SNLI Mix : use of SNLI training dataset.…”

mentioning

confidence: 99%

Distance-based Self-Attention Network for Natural Language Inference

Im,

Cho

2017

Preprint

View full text Add to dashboard Cite

show abstract

The RepEval 2017 Shared Task: Multi-Genre Natural Language Inference with Sentence Representations

Nangia

Williams

Lazaridou

et al. 2017

Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP

View full text Add to dashboard Cite

This paper presents the results of the RepEval 2017 Shared Task, which evaluated neural network sentence representation learning models on the MultiGenre Natural Language Inference corpus (MultiNLI) recently introduced by Williams et al. (2017). All of the five participating teams beat the bidirectional LSTM (BiLSTM) and continuous bag of words baselines reported in Williams et al.. The best single model used stacked BiLSTMs with residual connections to extract sentence features and reached 74.5% accuracy on the genre-matched test set. Surprisingly, the results of the competition were fairly consistent across the genrematched and genre-mismatched test sets, and across subsets of the test data representing a variety of linguistic phenomena, suggesting that all of the submitted systems learned reasonably domainindependent representations for sentence meaning.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Character-level Intra Attention Network for Natural Language Inference

Cited by 2 publications

References 11 publications

Distance-based Self-Attention Network for Natural Language Inference

Distance-based Self-Attention Network for Natural Language Inference

The RepEval 2017 Shared Task: Multi-Genre Natural Language Inference with Sentence Representations

Contact Info

Product

Resources

About