2021
DOI: 10.48550/arxiv.2101.02697
Preprint

PVA: Pixel-aligned Volumetric Avatars

Amit Raj,
Michael Zollhoefer,
Tomas Simon
et al.

Abstract: We present a novel approach for the prediction of volumetric avatars of human heads from a small number of example views. Our model enables view synthesis for unseen identities and is able to generate faithful facial expressions.

citations
Cited by 9 publications
(12 citation statements)
references
References 21 publications
“…Baselines. Among the recent generalizable NeRF methods [36,52,47], we compare with Pixel-NeRF [52] and PVA [36] which focus on very sparse (up to 3 or 4) input views. We reimplement [36] since it is not open-sourced.…”
Section: Comparison With Generalizable NeRF Methods (mentioning)
confidence: 99%
“…Despite the promising results, these general NeRF [19,53] and human-specific NeRF [13,32,33,35,50] methods must be optimized for each new video separately, and generalize poorly on unseen scenarios. Generalizable NeRFs [36,47,52] try to avoid the expensive per-scene optimization by image conditioning using pixel-aligned features. However, directly extending such methods to model complex and dynamic 3D humans is not straightforward when available observations are highly sparse.…”
Section: Related Work (mentioning)
confidence: 99%
“…Classic 3D reconstruction uses SfM pipelines from multiple views of a rigid scene, with pipelines such as KinectFusion [35] and DynamicFusion [36] integrating depth sensors for 3D reconstruction of dense static and deformable surfaces. NeRF [34] and its deformable extensions [44,7,55,46,40] such as D-NeRF [45] synthesize novel views from densely sampled multi-views of a static or mildly dynamic scene using a Neural Radiance Field, from which explicit 3D geometry can be further extracted. A more recent work, LASR [63], optimizes a single 3D deformable model on an individual video sequence.…”
Section: Related Work (mentioning)
confidence: 99%