2020
DOI: 10.1007/978-3-030-58592-1_4

Monocular Real-Time Volumetric Performance Capture

Abstract: We present the first approach to volumetric performance capture and novel-view rendering at real-time speed from monocular video, eliminating the need for expensive multi-view systems or cumbersome pre-acquisition of a personalized template model. Our system reconstructs a fully textured 3D human from each frame by leveraging Pixel-Aligned Implicit Function (PIFu). While PIFu achieves high-resolution reconstruction in a memory-efficient manner, its computationally expensive inference prevents us from deploying…
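For context, the Pixel-Aligned Implicit Function (PIFu) that the abstract builds on defines the clothed human surface as a level set of an occupancy function conditioned on image features. The formulation below is the standard one from the original PIFu work, included here only as background for this report:

\[
f\big(F(\pi(X)),\, z(X)\big) = s, \qquad s \in [0, 1],
\]

where X is a 3D query point, \pi(X) its 2D projection into the input image, F(\cdot) the image feature sampled at that pixel (hence "pixel-aligned"), z(X) the depth of X in camera coordinates, and s the predicted occupancy. The surface is recovered as the 0.5 level set of s, so the representation is continuous and its memory cost does not grow with a fixed output resolution, which is what makes PIFu memory-efficient relative to a dense voxel grid.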

Cited by 71 publications (48 citation statements)
References 91 publications
“…Moreover, SMPL [28] regression or optimization can be incorporated to generate more reliable and robust outputs, as shown in [48,47]. Real-time methods can be implemented with the aid of a single depth sensor [43,44] or by innovating computation and rendering algorithms [24]. Regarding the 3D representations used in these methods, we can split them into two categories: explicit [37,30,48] and implicit [33,34,19,20] reconstruction methods.…”
Section: Related Work
Mentioning confidence: 99%
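The SMPL regression or optimization mentioned in this excerpt refers to fitting a low-dimensional parametric body model as a prior. As a hedged illustration only, and not the pipeline of this paper or of the cited works, the snippet below poses the SMPL model via the third-party smplx package; the model directory and the zeroed shape/pose parameters are placeholder assumptions.

# Illustrative sketch: evaluating the SMPL parametric body model with smplx.
# SMPL_MODEL_DIR is a placeholder path to locally downloaded SMPL model files.
import torch
import smplx

SMPL_MODEL_DIR = "models"  # assumption: SMPL .pkl files live under this directory
model = smplx.create(SMPL_MODEL_DIR, model_type="smpl", gender="neutral")

betas = torch.zeros(1, 10)         # shape coefficients (zeros = mean shape)
body_pose = torch.zeros(1, 69)     # axis-angle rotations for the 23 body joints
global_orient = torch.zeros(1, 3)  # root orientation

output = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
vertices = output.vertices         # (1, 6890, 3) posed template mesh vertices
joints = output.joints             # regressed 3D joint locations

In the kind of work cited above, these few parameters are regressed by a network or optimized against image evidence, and the resulting minimally clothed body mesh acts as a robust prior for the detailed reconstruction.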
“…Benefiting from the fast improvement of deep implicit functions for 3D representations, recent methods [33,34,24] are able to recover the 3D body shape from only a single RGB image. Compared with voxel-based [37,19,48] or mesh-based [30,1] representations, an implicit function guides deep learning models to capture geometric details more efficiently.…”
Section: Introduction
Mentioning confidence: 99%
“…[23,27,38,25,15,43,26,47,31] learn to infer body pose and shape from a single image, but only consider minimally clothed humans. Various methods [48,60,6,42,41,18,21,59,28,36,13] have recently been proposed to reconstruct humans in clothing. BodyNet [48] and DeepHuman [60] output human shape in the form of occupancy voxel grids.…”
Section: Learning-Based Approaches with Monocular RGB
Mentioning confidence: 99%
“…Such a representation has difficulty capturing fine details due to its high memory footprint. Neural implicit functions have been introduced to replace the explicit voxel grid and have enabled high-fidelity reconstructions from single images [42,41,18,21,59,28]. A major limitation of these methods is the lack of generalization to unseen poses in the wild.…”
Section: Learning-Based Approaches with Monocular RGB
Mentioning confidence: 99%
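The voxel-versus-implicit contrast drawn in these excerpts can be made concrete with a short, generic sketch. Nothing below is taken from the cited papers: occupancy_net stands for any hypothetical network mapping 3D points to occupancy values in [0, 1], and the resolution and threshold are arbitrary. The point is only that the learned representation is the network itself; a dense grid is materialized transiently, at whatever resolution is desired, purely for surface extraction.

# Generic sketch: extracting a mesh from a learned implicit occupancy function.
# `occupancy_net` is a hypothetical model mapping (N, 3) points to (N, 1) occupancy.
import numpy as np
import torch
from skimage.measure import marching_cubes

def extract_mesh(occupancy_net, resolution=256, bound=1.0, threshold=0.5, batch=65536):
    # Build query points on a dense grid spanning [-bound, bound]^3.
    axis = np.linspace(-bound, bound, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
    occ = np.empty(len(grid), dtype=np.float32)
    with torch.no_grad():
        for i in range(0, len(grid), batch):
            pts = torch.from_numpy(grid[i:i + batch])
            occ[i:i + batch] = occupancy_net(pts).squeeze(-1).cpu().numpy()
    volume = occ.reshape(resolution, resolution, resolution)
    # Marching cubes recovers the iso-surface at the chosen occupancy threshold.
    verts, faces, normals, _ = marching_cubes(volume, level=threshold)
    # Map voxel indices back into world coordinates.
    verts = verts / (resolution - 1) * 2.0 * bound - bound
    return verts, faces, normals

A fixed voxel grid, by contrast, would have to store all resolution^3 occupancy values up front, which is the memory footprint the first excerpt refers to.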