2018 IEEE Winter Conference on Applications of Computer Vision (WACV) 2018
DOI: 10.1109/wacv.2018.00136
Channel-Recurrent Autoencoding for Image Modeling

[Figure 1: Comparison demonstrating our channel-recurrent VAE-GAN's superior ability to model complex bird images. Based on the high-quality generation of Stage1 64×64 images, higher-resolution Stage2 images can be further synthesized unsupervisedly.]

Abstract: Despite recent successes in synthesizing faces and bedrooms, existing generative models struggle to capture more complex image types (Figure 1), potentially due to the oversimplification of their latent space construc…

Cited by 10 publications (5 citation statements)
References 10 publications
“…VAE-GAN hybrids [76][77][78][79][80] function by combining a VAE's encoder with the GAN's generator network, creating an unsupervised generative model capable of encoding, generating, and discriminating image samples. The goal of the combined network is to improve image sample quality, representation learning, sample diversity, and training stability compared to individual VAE and GAN models.…”
Section: Generative Adversarial Network, Variational Autoencoders, and ...
confidence: 99%
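The hybrid objective described above can be sketched with a toy forward pass: a minimal NumPy illustration, assuming hypothetical linear stand-ins for the encoder, generator/decoder, and discriminator (the dimensions and weight names are invented for illustration, not taken from any cited model).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, for illustration only).
x_dim, z_dim = 16, 4

# Linear "networks": encoder -> (mu, logvar), generator/decoder, discriminator.
W_enc = rng.standard_normal((x_dim, 2 * z_dim)) * 0.1
W_gen = rng.standard_normal((z_dim, x_dim)) * 0.1
W_dis = rng.standard_normal((x_dim, 1)) * 0.1

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(x):
    h = x @ W_enc
    return h[:, :z_dim], h[:, z_dim:]          # mu, log-variance

def reparameterize(mu, logvar):
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps      # z = mu + sigma * eps

def generate(z):
    return z @ W_gen                            # decoder doubles as GAN generator

def discriminate(x):
    return sigmoid(x @ W_dis)                   # discriminator's P(real)

x = rng.standard_normal((8, x_dim))             # a batch of 8 "images"
mu, logvar = encode(x)
z = reparameterize(mu, logvar)
x_hat = generate(z)                             # the VAE's encoder feeds the GAN's generator

# Hybrid objective: VAE terms (reconstruction + KL) plus an adversarial term
# that pushes reconstructions toward the discriminator's "real" region.
recon = np.mean((x - x_hat) ** 2)
kl = -0.5 * np.mean(1 + logvar - mu ** 2 - np.exp(logvar))
adv = -np.mean(np.log(discriminate(x_hat) + 1e-8))
loss = recon + kl + adv
```

A real hybrid would train all three components with gradient descent; the point here is only how one latent sample flows through encoder, generator, and discriminator to produce a combined scalar loss.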
“…Similarly to the VAE models for text/dialogue generation mentioned above, VAEs for image modeling generally apply many-to-one encoding and one-to-many decoding: they encode a whole image (that is, a large number of pixels) into a single low-dimensional latent vector. They also rely on knowledge accumulated on the use of deep neural networks for image processing, e.g., the decomposition of an image into successive feature maps using successive convolution layers and the corresponding recomposition process, possibly combined with a multi-level or hierarchical latent encoding (Gulrajani et al., 2016; Shang et al., 2018). This sets those models somewhat apart from the temporal models we focus on, even if, as with text/dialogue generation VAEs, some propositions made in the literature on VAEs for image generation can be exploited in a more general framework.…”
Section: Aim of the Paper
confidence: 99%
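The many-to-one encoding and hierarchical latents mentioned in that statement can be sketched in a few lines: a toy NumPy example, assuming average pooling as a stand-in for strided convolutions (the image size, pooling scheme, and latent dimensions are invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)

# A 32x32 grayscale "image": many pixels, to be summarized by a small code.
image = rng.standard_normal((32, 32))

def downsample2x(fmap):
    """2x2 average pooling, standing in for a strided convolution layer."""
    h, w = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Successive "feature maps" (here just pooled grids) shrink toward a code.
level1 = downsample2x(image)        # 16x16 feature map
level2 = downsample2x(level1)       # 8x8 feature map

# Many-to-one encoding: 1024 pixels collapse into one small global vector.
z_top = level2.reshape(-1)[:8]      # 8-d global code

# Hierarchical latent encoding: also keep a latent at an intermediate level,
# rather than only at the top of the pyramid.
z_mid = level1.mean(axis=1)         # 16-d mid-level summary
```

The one-to-many decoding direction would reverse this pyramid, upsampling the codes back into a full-resolution image.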
“…Besides, Sohn et al. [36] also propose cVAE, whose encoder and decoder networks, unlike those of a standard VAE, are fully convolutional. However, this kind of convolutional architecture (Figure 1(b)) produces disordered samples, because the latent representation, sampled from a spatially independent prior, ignores the global structure of input images [37].…”
Section: Analysis of Encoder-Decoder Architecture
confidence: 99%
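The problem with a spatially independent prior can be made concrete: a minimal NumPy sketch, assuming a hypothetical 8×8×4 latent grid, contrasting per-location sampling with a single global code broadcast over space.

```python
import numpy as np

rng = np.random.default_rng(2)

h, w, c = 8, 8, 4   # spatial latent grid: c channels at each of h*w locations

# Spatially independent prior: every location is sampled on its own, so
# nothing ties distant cells together; any global coherence must be
# recovered by the decoder alone.
z_spatial = rng.standard_normal((h, w, c))

# Global-vector prior: one draw shared by the whole image.
z_global = rng.standard_normal(c)
z_broadcast = np.broadcast_to(z_global, (h, w, c))

# With the independent prior, two far-apart cells share nothing by
# construction; with the broadcast global code they are identical.
corner_a, corner_b = z_spatial[0, 0], z_spatial[h - 1, w - 1]
```

This is the structural gap that the channel-recurrent latent space of the paper under discussion aims to close: coupling the latent code across channels so that samples carry global structure.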
“…Consequently, we find that the encoder-decoder architecture of the methods [13]–[15] proposed for the facial attribute modification task is unsuitable for generating realistic new faces. Conversely, the latent representation of double-task models [18]–[20] may lack the capacity to model complex data distributions [37], resulting in unexpected changes in modified images.…”
Section: Analysis of Encoder-Decoder Architecture
confidence: 99%