2015
DOI: 10.48550/arxiv.1511.06349
Preprint

Generating Sentences from a Continuous Space

Cited by 167 publications (292 citation statements)
References 0 publications
“…Training VRAG involves optimizing two objectives: reducing the KL-divergence between the document-prior and document-posterior, and maximizing the log likelihood of the responses. VAE models often end up prioritizing the KL-divergence over the likelihood objective and sometimes end up with zero KL-divergence by forcing the document-posterior to match the prior (called posterior collapse) (Lucas et al. 2019; Bowman et al. 2015; Chen et al. 2016; Oord, Vinyals, and Kavukcuoglu 2017). However, we hypothesize that even in cases where there is no posterior collapse, the joint training could result in the response generator (likelihood term) being inadequately trained.…”
Section: Effect of Decoder Fine-tuning (mentioning)
confidence: 89%
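The collapse described above is a matter of how the two ELBO terms trade off during training. As a minimal sketch of that trade-off (PyTorch; the function and argument names are illustrative, not taken from VRAG or any of the cited papers), the loss below combines the token-level reconstruction term with the closed-form Gaussian KL term, plus the annealing weight that Bowman et al. (2015) ramp up from zero so the model cannot trivially drive the KL term to zero:

import torch
import torch.nn.functional as F

def sequence_vae_loss(logits, targets, mu, logvar, kl_weight=1.0):
    """Negative ELBO for a sequence VAE (illustrative sketch).

    logits:      (batch, seq_len, vocab) decoder outputs
    targets:     (batch, seq_len) gold token ids
    mu, logvar:  (batch, latent_dim) Gaussian posterior parameters
    kl_weight:   annealing factor ramped from 0 to 1 during training
    """
    # Reconstruction term: log likelihood of the response tokens.
    recon = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1), reduction="sum"
    )
    # KL(q(z|x) || N(0, I)) in closed form; collapse shows up as this term -> 0.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl, recon, kl

Monitoring the returned kl value during training is a simple way to detect the posterior collapse the authors describe: a value that stays near zero means the posterior has matched the prior and the latent code carries no information.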
“…These models are trained end-to-end by maximizing the likelihood of responses on a given training set of conversational logs (Sordoni et al. 2015). However, real-world conversational systems also need to be able to incorporate external (structured or unstructured) knowledge while generating responses (Gangi Reddy et al. 2019; Budzianowski et al. 2018; Qu et al. 2020; …). Existing work for knowledge-grounded tasks typically relies on the availability of conversation logs and the associated knowledge source instance (Qu et al. 2020; Gangi Reddy et al. 2019), though some recent methods relax the dependence on annotated logs (Raghu, Gupta et al. 2020; Lewis et al. 2020; Guu et al. 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…By contrast, we aim to generate a sequence from a latent space whose dimension is invariant to sequence length. In this respect, the proposed method is similar to RNN-based variational autoencoders (VAEs) [26], [27]. However, the assumption that the latent distribution is Gaussian is not always realistic and limits generation performance, because real-world data may follow a much more complex distribution.…”
Section: Related Work (mentioning)
confidence: 99%
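To make the length-invariance point concrete, a minimal sketch follows (PyTorch; the module and variable names are illustrative, not taken from the cited works): a fixed-size latent vector z, sampled from the assumed Gaussian prior, conditions an RNN decoder through its initial hidden state, so the same z can drive outputs of any length.

import torch
import torch.nn as nn

class LatentToSequence(nn.Module):
    """Decode a fixed-dimension latent vector into a token sequence."""
    def __init__(self, latent_dim=32, hidden_dim=128, vocab_size=1000, embed_dim=64):
        super().__init__()
        self.init_h = nn.Linear(latent_dim, hidden_dim)  # map z to the initial RNN state
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, z, tokens):
        # z is (batch, latent_dim) regardless of how long `tokens` is.
        h0 = torch.tanh(self.init_h(z)).unsqueeze(0)      # (1, batch, hidden_dim)
        states, _ = self.gru(self.embed(tokens), h0)      # (batch, seq_len, hidden_dim)
        return self.out(states)                           # (batch, seq_len, vocab_size)

decoder = LatentToSequence()
z = torch.randn(4, 32)                    # sample from the standard-normal prior
tokens = torch.randint(0, 1000, (4, 20))  # teacher-forced inputs of length 20
logits = decoder(z, tokens)               # the same z would work for length 5 or 500

The Gaussian prior in the sampling line is exactly the assumption the quoted authors call restrictive: swapping in a richer prior changes only how z is drawn, not the decoder.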
“…Another category of representation learning methods is the VAE and its variants [22], [23]. Bowman et al. introduce an RNN-based VAE to model the latent properties of sentences [24], which inspires us to learn the traits of drivers from their trajectories. Conditional VAEs (CVAEs) are widely used in pedestrian and vehicle trajectory prediction since discrete latent states can represent different behavior modes such as braking and turning [25]-[28].…”
Section: B. Representation Learning for Sequential Data (mentioning)
confidence: 99%