Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)
DOI: 10.18653/v1/d17-1070

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

Abstract: Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have, however, not been as successful. Several attempts at learning unsupervised representations of sentences have not reached performance satisfactory enough to be widely adopted. In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language In…
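
The training setup summarized in the abstract, a bidirectional LSTM encoder with max pooling whose premise and hypothesis vectors u and v are combined as [u, v, |u - v|, u * v] and fed to a small classifier over the three NLI labels, can be pictured with a short sketch. The PyTorch code below is a minimal illustrative sketch, not the authors' released implementation; the class names and layer sizes are assumptions for the example.

    # Minimal sketch of an InferSent-style NLI training architecture (illustrative only).
    import torch
    import torch.nn as nn

    class BiLSTMMaxEncoder(nn.Module):
        """Bidirectional LSTM over word embeddings, max-pooled over time."""
        def __init__(self, vocab_size, emb_dim=300, hidden_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden_dim, bidirectional=True, batch_first=True)

        def forward(self, tokens):                     # tokens: (batch, seq_len)
            states, _ = self.lstm(self.embed(tokens))  # (batch, seq_len, 2 * hidden_dim)
            return states.max(dim=1).values            # max pooling over the time dimension

    class NLIClassifier(nn.Module):
        """Combines premise/hypothesis vectors as [u, v, |u - v|, u * v] -> 3 NLI classes."""
        def __init__(self, encoder, sent_dim=1024):
            super().__init__()
            self.encoder = encoder
            self.mlp = nn.Sequential(nn.Linear(4 * sent_dim, 512), nn.ReLU(), nn.Linear(512, 3))

        def forward(self, premise, hypothesis):
            u, v = self.encoder(premise), self.encoder(hypothesis)
            features = torch.cat([u, v, torch.abs(u - v), u * v], dim=1)
            return self.mlp(features)                  # entailment / neutral / contradiction logits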

Cited by 1,692 publications (1,754 citation statements). References 32 publications.

“…Sentence embedding from bi-directional LSTM trained on SNLI (Conneau et al., 2017): 80.1 / 75.8
C-PHRASE, prediction of syntactic constituent context words (Pham et al., 2015): 74.3 / 63.9
PV-DBOW, paragraph vectors, Doc2Vec DBOW (Le and Mikolov, 2014; Lau and Baldwin, 2016): 72.2 / 64.9
Averaged Word Embedding Baselines:
LexVec, weighted matrix factorization of PPMI (Salle et al., 2016a,b): 68.9 / 55.8
FastText, skip-gram with sub-word character n-grams (Joulin et al., 2016): 65.3 / 53.6
Paragram, Paraphrase Database (PPDB) fit word embeddings (Wieting et al., 2015): 63.0 / 50.1
GloVe, word co-occurrence count fit embeddings (Pennington et al., 2014): 52.4 / 40.6
Word2vec…”
Section: STS Benchmark (mentioning)
confidence: 99%
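
The two numbers quoted per model are STS Benchmark correlation scores (scaled by 100), typically obtained by comparing the cosine similarity of the two sentence embeddings in each pair against the human similarity judgement. A minimal sketch of that evaluation, assuming hypothetical precomputed embedding matrices emb_a and emb_b (one row per sentence) and gold similarity scores gold:

    # Sketch: scoring sentence embeddings on an STS-style task via cosine similarity.
    # emb_a, emb_b, and gold are hypothetical inputs: (n, d) embedding arrays for the
    # two sides of each sentence pair and the n human similarity scores.
    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    def cosine_rows(a, b):
        """Row-wise cosine similarity between two (n, d) arrays."""
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return np.sum(a * b, axis=1)

    def sts_scores(emb_a, emb_b, gold):
        """Pearson and Spearman correlations (x100) between cosine similarity and gold scores."""
        sims = cosine_rows(emb_a, emb_b)
        return 100 * pearsonr(sims, gold)[0], 100 * spearmanr(sims, gold)[0]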
“…Sentence embeddings: Our input sentences x are sentence embeddings obtained by a pretrained sentence encoder (Conneau et al., 2017) (this is different from the sentence encoder in our model). The pretrained sentence encoder is a BiLSTM with max pooling trained on the Stanford Natural Language Inference corpus (Bowman et al., 2015) for textual entailment.…”
Section: Inputs (mentioning)
confidence: 99%
“…The pretrained sentence encoder is a BiLSTM with max pooling trained on the Stanford Natural Language Inference corpus (Bowman et al., 2015) for textual entailment. Sentence embeddings from this encoder, combined with logistic regression on top, showed good performance in various transfer tasks, such as entailment and caption-image retrieval (Conneau et al., 2017).…”
Section: Inputs (mentioning)
confidence: 99%
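
The transfer recipe described in the two quotes above is deliberately simple: keep the pretrained encoder frozen, embed each sentence once, and train only a linear classifier (logistic regression) on top. A minimal sketch, assuming hypothetical precomputed embedding matrices X_train / X_test (one row per sentence) and task labels y_train / y_test:

    # Sketch: transfer evaluation with frozen sentence embeddings plus a linear probe.
    # X_train, y_train, X_test, y_test are assumed to be precomputed sentence
    # embeddings (one row per sentence) and the corresponding task labels.
    from sklearn.linear_model import LogisticRegression

    probe = LogisticRegression(max_iter=1000)   # only this linear layer is trained
    probe.fit(X_train, y_train)                 # the sentence encoder itself stays frozen
    print("transfer accuracy:", probe.score(X_test, y_test))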
“…These networks are capable of "memorizing" information, and are thus able to better represent longer segments of text without the danger of vanishing/exploding gradients encountered in traditional (vanilla) recurrent neural networks [50]. These types of networks have been successfully used in most NLP tasks [51].…”
Section: Deep Neural Network and Summary Evaluation (mentioning)
confidence: 99%
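
For context on the quoted claim, the gating and additive cell-state updates of an LSTM are what let gradients flow across long sequences. A minimal PyTorch illustration of running an LSTM layer over a batch of embedded token sequences (all sizes are illustrative):

    # Sketch: an LSTM layer over embedded token sequences (sizes illustrative).
    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=300, hidden_size=512, batch_first=True)
    x = torch.randn(8, 40, 300)            # (batch, seq_len, embedding_dim)
    outputs, (h_n, c_n) = lstm(x)          # outputs: (8, 40, 512); h_n, c_n: (1, 8, 512)
    # c_n is the additively updated cell state ("memory") that mitigates the
    # vanishing-gradient problem of a plain recurrent layer such as nn.RNN.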