2016
DOI: 10.48550/arxiv.1605.06197
Preprint

Stick-Breaking Variational Autoencoders

Cited by 16 publications (20 citation statements)
References 0 publications
“…For NP and ANP trained on functions generated from GP, we illustrate the weight norm of the decoding layer right behind the latent variables in Figure 5. The sparsely-coded decoder implies the redundancy of the stochastic path due to the component collapsing behavior referred to in [40,24]. This phenomenon can be explained by the information preference problem [7,73] where the information flow is concentrated on the deterministic path with the tendency to ignore the stochastic path.…”
Section: Encoder-Decoder Pipeline (mentioning)
confidence: 99%
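The diagnostic in the excerpt above (inspecting the weight norms of the decoder layer sitting directly behind the latent variables) can be sketched in a few lines. The snippet below is a hedged illustration, not the cited papers' code; the layer sizes, the threshold, and the names `decoder_in` and `per_latent_norm` are assumptions.

```python
# Minimal sketch: diagnose component collapse by inspecting the weight norms of
# the decoder layer sitting directly behind the latent variables.
import torch
import torch.nn as nn

latent_dim, hidden_dim = 16, 128                # hypothetical sizes
decoder_in = nn.Linear(latent_dim, hidden_dim)  # first decoder layer after the latents

with torch.no_grad():
    # Column j holds the weights leaving latent dimension j; a near-zero norm
    # means the decoder effectively ignores that component ("collapse").
    per_latent_norm = decoder_in.weight.norm(dim=0)          # shape: (latent_dim,)
    collapsed = (per_latent_norm < 1e-3).nonzero().flatten()

print(per_latent_norm)
print("near-zero (collapsed) components:", collapsed.tolist())
```

A column whose norm is close to zero corresponds to a latent component the decoder ignores, which is the component-collapsing symptom the excerpt describes.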
“…As for the prior, the standard VAE uses the normal distribution, which may lead to over-regularization and posterior collapse and further hurt density-estimation performance [5,11]. Early work applied more complex priors in VAEs, such as the Dirichlet process prior [18] and the Chinese Restaurant Process prior [19], to improve the capacity of the variational posterior. However, these methods can only be trained with specific tricks and learning methods.…”
Section: Bayesian Pseudocoresets Exemplar VAE (mentioning)
confidence: 99%
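For context, the standard-normal prior mentioned in the excerpt enters the VAE objective through a closed-form KL term; the sketch below shows only that generic term (tensor shapes and names are illustrative, not taken from the cited work).

```python
# Generic sketch: closed-form KL(q(z|x) || N(0, I)) for a diagonal-Gaussian encoder.
import torch

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over the latent dimension.
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)

mu = torch.zeros(4, 16)      # hypothetical batch of encoder means
logvar = torch.zeros(4, 16)  # and log-variances
print(gaussian_kl(mu, logvar))  # all zeros: q already equals the prior
```

When this term is over-weighted, the approximate posterior is pushed onto the prior, which is the over-regularization / posterior-collapse behavior the excerpt refers to.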
“…The simple CDF of Kuma also makes the reparameterization trick easily applicable [51,79,62]. Lastly, the KL divergence between the Kuma and Beta distributions can be approximated in closed form [70]. We fix βt = 1 to ease optimization since the Kuma and Beta distributions coincide when…”
Section: Recurrent Variational Model With Differentiable Binary Latent... (mentioning)
confidence: 99%
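The inverse-CDF sampler the excerpt alludes to follows directly from the Kumaraswamy CDF F(x) = 1 − (1 − x^a)^b. The sketch below illustrates reparameterized sampling under that standard parameterization; the function name and the clamping constant are assumptions, not taken from the cited paper.

```python
# Sketch of reparameterized sampling from Kumaraswamy(a, b) via its inverse CDF.
# CDF: F(x) = 1 - (1 - x**a)**b, so x = (1 - (1 - u)**(1/b))**(1/a), u ~ Uniform(0, 1).
import torch

def kumaraswamy_rsample(a, b, eps=1e-6):
    u = torch.rand_like(a).clamp(eps, 1 - eps)       # keep the noise away from {0, 1}
    return (1 - (1 - u).pow(1.0 / b)).pow(1.0 / a)   # differentiable in a and b

a = torch.full((5,), 2.0, requires_grad=True)
b = torch.ones(5)              # with b = 1 the density is a * x**(a-1),
v = kumaraswamy_rsample(a, b)  # i.e. Kumaraswamy(a, 1) coincides with Beta(a, 1)
print(v)
```

Fixing b = 1 therefore reduces the Kumaraswamy posterior to a Beta(a, 1), which is the simplification the excerpt mentions.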
“…The nodes, termed pivotal states, are the most critical states in recovering action trajectories [17,47,32]. In particular, given a set of trajectories, we optimize a fully differentiable recurrent variational auto-encoder [19,34,51] with binary latent variables [70]. Each binary latent variable is assigned to a state, and the prior distribution learned conditioned on that state indicates whether it belongs to the set of pivotal states.…”
Section: Introduction (mentioning)
confidence: 99%
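One common way to make per-state binary latent variables differentiable is a relaxed Bernoulli (binary Concrete) sample with an optional straight-through threshold. The sketch below shows that generic construction only; it is not necessarily the mechanism of [70] or the citing paper, and the names, temperature, and threshold are assumptions.

```python
# Generic sketch: a relaxed Bernoulli (binary Concrete) latent with an optional
# straight-through hard threshold.
import torch
from torch.distributions import RelaxedBernoulli

def sample_binary_latent(logits, temperature=0.5, hard=True):
    dist = RelaxedBernoulli(torch.tensor(temperature), logits=logits)
    soft = dist.rsample()            # values in (0, 1), differentiable w.r.t. logits
    if not hard:
        return soft
    # Straight-through: the forward value is 0/1, gradients flow through `soft`.
    return (soft > 0.5).float() + soft - soft.detach()

state_logits = torch.zeros(3, 8, requires_grad=True)  # hypothetical per-state logits
z = sample_binary_latent(state_logits)
print(z)
```

A learned prior over these binary latents can then be read off per state to decide whether that state counts as pivotal, as described in the excerpt.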