2022
DOI: 10.48550/arxiv.2201.13433
Preprint

Third Time's the Charm? Image and Video Editing with StyleGAN3

Abstract: StyleGAN is arguably one of the most intriguing and well-studied generative models, demonstrating impressive performance in image generation, inversion, and manipulation. In this work, we explore the recent StyleGAN3 architecture, compare it to its predecessor, and investigate its unique advantages, as well as drawbacks. In particular, we demonstrate that while StyleGAN3 can be trained on unaligned data, one can still use aligned data for training, without hindering the ability to generate unaligned imagery. N…

Cited by 7 publications (9 citation statements)
References 54 publications
“…Another challenge can be found in the texture-sticking phenomenon observed in StyleGAN1 and StyleGAN2 [Karras et al 2021], which hinders the realism of generated and manipulated videos. To overcome this, Alaluf et al [2022] combine the PTI [Roich et al 2021] and ReStyle [Alaluf et al 2021b] encoding techniques for encoding and editing videos with the StyleGAN3 [Karras et al 2021] generator. Further leveraging the equivariance of StyleGAN3, they demonstrate the ability to expand the field of view when working on a video with a cropped subject, resulting in more uniform video editing.…”
Section: Gan Inversionmentioning
confidence: 99%
“…3, left). However, this change also modifies the latent space after training, which in turn damages editability.…”
Section: Overparameterized Inversionmentioning
confidence: 99%
“…In particular, the extension to W+ uses a different latent code for each layer. This change is a key component of most inversion pipelines [4,15,41,49,57] as it improves reconstruction quality tremendously. But it comes at a price: because the latent space is altered after training, editability suffers.…”
Section: Introductionmentioning
confidence: 99%
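The statement above hinges on the difference between the W space (one latent code shared by all synthesis layers) and the extended W+ space (an independent code per layer). A minimal sketch of the two parameterizations, using placeholder layer counts and dimensions rather than the real StyleGAN values:

```python
import numpy as np

# Hypothetical illustration of W vs. W+ latent codes (shapes are placeholders).
num_layers, latent_dim = 16, 512

# W space: a single code, broadcast identically to every synthesis layer.
w = np.random.randn(latent_dim)
w_broadcast = np.tile(w, (num_layers, 1))          # shape (16, 512), all rows equal

# W+ space: an independent code per layer -- many more degrees of freedom,
# hence better reconstructions, but the codes can drift off the distribution
# the generator was trained on, which is why editability suffers.
w_plus = np.random.randn(num_layers, latent_dim)   # shape (16, 512), rows differ

print(w_broadcast.shape, w_plus.shape)
```

The extra per-layer freedom is exactly the trade-off the citation describes: W+ improves reconstruction because each layer can be steered independently, while edits learned on the trained latent distribution transfer less reliably to such out-of-distribution codes.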
“…Recently, I2I applications have become widespread, spanning a plethora of diverse tasks such as attribute manipulation [1]–[4], sketch-to-image [5], [6], style transfer [7], [8], semantic synthesis [9], [10], and others [11]–[14]. For such tasks, generative adversarial networks (GANs) [15] are particularly suitable, as they place few restrictions on the generator network.…”
Section: Introductionmentioning
confidence: 99%