Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1031

Language as a Latent Variable: Discrete Generative Models for Sentence Compression

Abstract: In this work we explore deep generative models of text in which the latent representation of a document is itself drawn from a discrete language model distribution. We formulate a variational auto-encoder for inference in this model and apply it to the task of compressing sentences. In this application the generative model first draws a latent summary sentence from a background language model, and then subsequently draws the observed sentence conditioned on this latent summary. In our empirical evaluation we s…
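The generative story in the abstract implies a standard variational lower bound; the following is a minimal sketch, where the notation (x for the observed sentence, s for the latent summary sentence, θ and φ for generative and inference parameters) is mine and not taken from the paper:

\[
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(s \mid x)}\!\left[\log p_\theta(x \mid s)\right] \;-\; \mathrm{KL}\!\left(q_\phi(s \mid x)\,\|\,p(s)\right)
\]

Here p(s) is the background language model prior over summary sentences, p_\theta(x | s) reconstructs the observed sentence from the summary, and q_\phi(s | x) is the compression (inference) network. Because s is a discrete token sequence, the expectation cannot be reparameterized and is typically estimated with sampling-based gradient estimators.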

Cited by 192 publications (197 citation statements); references 6 publications.

“…The difficulty of providing sufficient supervision has motivated work on semi-supervised and unsupervised learning for many of these tasks (McClosky et al., 2006; Spitkovsky et al., 2010; Subramanya et al., 2010; Stratos and Collins, 2015; Marinho et al., 2016; Tran et al., 2016), including several that also used autoencoders (Ammar et al., 2014; Lin et al., 2015; Miao and Blunsom, 2016; Kočiský et al., 2016; Cheng et al., 2017). In this paper we expand on these works and suggest a neural CRF autoencoder that can leverage both labeled and unlabeled data.…”
Section: Related Work (mentioning, confidence: 99%)
“…At the early stage of training, we set λ to be zero and let the model first figure out how to project the representation of the source sequence to a roughly right point in the space and then regularize it with the KL cost. This technique can also be seen in (Kočiský et al., 2016; Miao and Blunsom, 2016). Input Dropout in the Decoder: Besides annealing the KL cost, we also randomly drop out the input token with a probability of β at each time step of the decoder during learning.…”
Section: Learning Continuous Latent Variables (mentioning, confidence: 99%)
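The two tricks quoted above, KL-cost annealing and decoder input dropout, are straightforward to sketch. The following is a minimal illustration assuming a generic VAE-style training loop; the function names kl_weight and drop_decoder_inputs, and the schedule and dropout constants, are hypothetical choices rather than the cited authors' settings.

import random

def kl_weight(step, warmup_steps=2000, anneal_steps=8000):
    """KL-cost annealing: keep the weight (the lambda in the quote) at 0 for an
    initial warm-up phase, then grow it linearly to 1 so the KL term only starts
    to regularize the latent space once reconstruction is roughly in place."""
    if step < warmup_steps:
        return 0.0
    return min(1.0, (step - warmup_steps) / float(anneal_steps))

def drop_decoder_inputs(tokens, beta=0.25, unk_token="<unk>"):
    """Decoder input dropout: at each time step, replace the ground-truth input
    token with <unk> with probability beta, so the decoder cannot rely purely on
    its autoregressive history and must use the latent variable."""
    return [unk_token if random.random() < beta else t for t in tokens]

# Hypothetical usage inside a training loop (loss terms are placeholders):
#   loss = reconstruction_loss + kl_weight(step) * kl_divergence
#   decoder_inputs = drop_decoder_inputs(target_tokens, beta=0.25)
if __name__ == "__main__":
    print(kl_weight(1000), kl_weight(6000), kl_weight(20000))
    print(drop_decoder_inputs("the cat sat on the mat".split(), beta=0.3))

Holding the KL weight at zero early lets the reconstruction term shape the latent space before the prior pulls the posterior toward it, and dropping decoder inputs weakens the autoregressive shortcut so the latent variable stays informative.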