StyleRig: Rigging StyleGAN for 3D Control Over Portrait Images

Tewari, Ayush; Elgharib, Mohamed; Bharaj, Gaurav; Bernard, Florian; Seidel, Hans‐Peter; Pérez, Patrick; Zollhöfer, Michael; Theobalt, Christian

doi:10.1109/cvpr42600.2020.00618

Cited by 376 publications

(241 citation statements)

References 24 publications

Supporting

Mentioning

239

Contrasting


“…Deng et al [2020] imitate the 3D rendering process and introduce contrastive learning to learn a disentangled latent space. Many other works (e.g., [Härkönen et al 2020; Shen et al 2020; Tewari et al 2020]) have tried to analyze and disentangle the latent code of some pretrained GAN space ] also with labeled data of specific attributes. Although these works successfully disentangle the latent space, they could only control a limited number of predefined attributes such as gender, expression, and age, due to the use of labeled data in the training stage.…”

Section: Neural Image Disentanglement (mentioning)

confidence: 99%

DeepFaceEditing

et al. 2021


Recent facial image synthesis methods have been mainly based on conditional generative models. Sketch-based conditions can effectively describe the geometry of faces, including the contours of facial components, hair structures, as well as salient edges (e.g., wrinkles) on face surfaces but lack effective control of appearance, which is influenced by color, material, lighting condition, etc. To have more control of generated results, one possible approach is to apply existing disentangling works to disentangle face images into geometry and appearance representations. However, existing disentangling methods are not optimized for human face editing, and cannot achieve fine control of facial details such as wrinkles. To address this issue, we propose DeepFaceEditing, a structured disentanglement framework specifically designed for face images to support face generation and editing with disentangled control of geometry and appearance. We adopt a local-to-global approach to incorporate the face domain knowledge: local component images are decomposed into geometry and appearance representations, which are fused consistently using a global fusion module to improve generation quality. We exploit sketches to assist in extracting a better geometry representation, which also supports intuitive geometry editing via sketching. The resulting method can either extract the geometry and appearance representations from face images, or directly extract the geometry representation from face sketches. Such representations allow users to easily edit and synthesize face images, with decoupled control of their geometry and appearance. Both qualitative and quantitative evaluations show the superior detail and appearance control abilities of our method compared to state-of-the-art methods.

“…The work of Shen et al [5] shows that GANs trained on high quality images learn various semantics in some linear subspaces of the latent space. Tewari et al [18] introduced an approach that provides a face rig-like control on generated images by training a rigging network between 3D morphable face model's semantic parameters and StyleGAN's input. Other approaches have attempted to imitate or directly carry out Principal Component Analysis (PCA) in the latent space of generative networks [6,19].…”

Section: Related Work (mentioning)

confidence: 99%

Learning Non-Linear Disentangled Editing For Stylegan

Yao, Newson, Gousseau, et al. 2021

2021 IEEE International Conference on Image Processing (ICIP)


Figure (panels, left to right: Eyeglasses, Gray Hair, Age, Hairline, Original, Slender, Smiling, Wavy Hair, Makeup): Sequential disentangled attribute manipulation. We show in this example how to achieve realistic, controllable, disentangled face editing. From the original image (center), we propose two opposite editing directions where only one attribute is manipulated at a time. To the right: 'slender', 'smiling', 'wavy hair' and 'makeup' and to the left: 'receding hairline', 'age', 'gray hair' and 'eyeglasses'. All results are obtained at resolution 1024 × 1024.


“…We build on the StyleGAN architecture [Karras et al 2019, 2020b] that is inherently a 2D image generator. Recent methods [Abdal et al 2021; Tewari et al 2020a] can render head poses parameterized by two angles, but have no way to generate an image for a specific and complete 3D camera, ignoring at least five degrees of freedom (DoF). This results in a small subspace of 3D camera poses; the nature and limits of this subspace have never been precisely defined, much less extended.…”

Section: Introduction (mentioning)

confidence: 99%

FreeStyleGAN

Leimkühler, Drettakis 2021

ACM Trans. Graph.


Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images. Embedding real images into the latent space of such models enables high-level image editing. While recent methods provide considerable semantic control over the (re-)generated images, they can only generate a limited set of viewpoints and cannot explicitly control the camera. Such 3D camera control is required for 3D virtual and mixed reality applications. In our solution, we use a few images of a face to perform 3D reconstruction, and we introduce the notion of the GAN camera manifold, the key element allowing us to precisely define the range of images that the GAN can reproduce in a stable manner. We train a small face-specific neural implicit representation network to map a captured face to this manifold and complement it with a warping scheme to obtain free-viewpoint novel-view synthesis. We show how our approach - due to its precise camera control - enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines, allowing e.g., stereo rendering or consistent insertion of faces in synthetic 3D environments. Our solution proposes the first truly free-viewpoint rendering of realistic faces at interactive rates, using only a small number of casual photos as input, while simultaneously allowing semantic editing capabilities, such as facial expression or lighting changes.
