Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
DOI: 10.18653/v1/2020.emnlp-main.122

A Bilingual Generative Transformer for Semantic Sentence Embedding

Abstract: Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates semantic closeness between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating …
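As a minimal illustration of the property the abstract describes (closeness in embedding space indicating semantic closeness), the sketch below compares toy sentence vectors with cosine similarity. This is not the paper's bilingual generative model; the vectors and names here are invented for demonstration only.

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two embedding vectors:
    # values near 1 indicate semantically close sentences.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-dimensional "embeddings" (real models produce hundreds of dimensions).
emb_cat = np.array([0.9, 0.1, 0.2])      # e.g. "The cat sat on the mat."
emb_feline = np.array([0.85, 0.15, 0.25])  # a paraphrase of the sentence above
emb_stock = np.array([0.1, 0.9, 0.3])    # an unrelated sentence

sim_close = cosine_similarity(emb_cat, emb_feline)
sim_far = cosine_similarity(emb_cat, emb_stock)
```

A well-trained embedding model would place the paraphrase pair much closer together than the unrelated pair, i.e. `sim_close` well above `sim_far`.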

Cited by 18 publications (11 citation statements) · References 32 publications
“…In Table 10, we compare our sentence embedding method to previous approaches including BERT (Mean) (Devlin et al., 2018), InferSent (Conneau et al., 2017), GenSen (Subramanian et al., 2018), USE (Cer et al., 2018), Sentence-BERT (Reimers and Gurevych, 2019), uSIF (Ethayarajh, 2018a), Charagram (Wieting and Gimpel, 2017), and BGT (Wieting et al., 2019b). On average, our embeddings outperform previous approaches by 0.2% on STS 2012 to 2016 (Agirre et al., 2012, 2013, 2014, 2015, 2016), and by 0.9% on STS-Benchmark (Cer et al., 2017)…”
Section: Sentence Embeddings (SASE)
Mentioning confidence: 99%
“…Various sentence embedding models have been proposed in recent years. Most of these models utilize supervision from parallel data (Artetxe and Schwenk, 2019b; Wieting et al., 2019, 2020), natural language inference data (Conneau et al., 2017; Cer et al., 2018; Reimers and Gurevych, 2019), or a combination of both (Subramanian et al., 2018).…”
Section: Related Work
Mentioning confidence: 99%
“…The constituency parse trees of all sentences are obtained from Stanford CoreNLP (Manning et al., 2014). We finetune a 6-layer BART base encoder as the semantic … Wieting et al. (2020). *BGT is evaluated on an additional dataset from STS13, which is not included in the standard SentEval toolkit.…”
Section: Setup
Mentioning confidence: 99%