2020
DOI: 10.1007/978-3-030-58592-1_4

Monocular Real-Time Volumetric Performance Capture

Abstract: We present the first approach to volumetric performance capture and novel-view rendering at real-time speed from monocular video, eliminating the need for expensive multi-view systems or cumbersome pre-acquisition of a personalized template model. Our system reconstructs a fully textured 3D human from each frame by leveraging Pixel-Aligned Implicit Function (PIFu). While PIFu achieves high-resolution reconstruction in a memory-efficient manner, its computationally expensive inference prevents us from deploying…
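For context, the Pixel-Aligned Implicit Function (PIFu) that the abstract builds on defines the clothed human surface as a level set of an occupancy function conditioned on image features. The formulation below is the standard one from the original PIFu work, included here only as background for this report:

\[
f\big(F(\pi(X)),\, z(X)\big) = s, \qquad s \in [0, 1],
\]

where X is a 3D query point, \pi(X) its 2D projection into the input image, F(\cdot) the image feature sampled at that pixel (hence "pixel-aligned"), z(X) the depth of X in camera coordinates, and s the predicted occupancy. The surface is recovered as the 0.5 level set of s, so the representation is continuous and its memory cost does not grow with a fixed output resolution, which is what makes PIFu memory-efficient relative to a dense voxel grid.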

Cited by 71 publications (48 citation statements)
References 91 publications
“…Moreover, SMPL [28] regression or optimization can be incorporated to generate more reliable and robust outputs, as shown in [48,47]. Real-time methods can be implemented with the aid of a single depth sensor [43,44] or by innovating computation and rendering algorithms [24]. Regarding the 3D representations used in these methods, we can split them into two categories: explicit [37,30,48] and implicit [33,34,19,20] reconstruction methods.…”
Section: Related Work
Mentioning confidence: 99%
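The SMPL regression or optimization mentioned in this excerpt refers to fitting a low-dimensional parametric body model as a prior. As a hedged illustration only, and not the pipeline of this paper or of the cited works, the snippet below poses the SMPL model via the third-party smplx package; the model directory and the zeroed shape/pose parameters are placeholder assumptions.

# Illustrative sketch: evaluating the SMPL parametric body model with smplx.
# SMPL_MODEL_DIR is a placeholder path to locally downloaded SMPL model files.
import torch
import smplx

SMPL_MODEL_DIR = "models"  # assumption: SMPL .pkl files live under this directory
model = smplx.create(SMPL_MODEL_DIR, model_type="smpl", gender="neutral")

betas = torch.zeros(1, 10)         # shape coefficients (zeros = mean shape)
body_pose = torch.zeros(1, 69)     # axis-angle rotations for the 23 body joints
global_orient = torch.zeros(1, 3)  # root orientation

output = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
vertices = output.vertices         # (1, 6890, 3) posed template mesh vertices
joints = output.joints             # regressed 3D joint locations

In the kind of work cited above, these few parameters are regressed by a network or optimized against image evidence, and the resulting minimally clothed body mesh acts as a robust prior for the detailed reconstruction.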
“…Benefiting from the fast improvement of deep implicit functions for 3D representations, recent methods [33,34,24] are able to recover the 3D body shape from only a single RGB image. Compared with voxel-based [37,19,48] or mesh-based [30,1] representations, an implicit function guides deep learning models to capture geometric details more efficiently.…”
Section: Introduction
Mentioning confidence: 99%
“…[23,27,38,25,15,43,26,47,31] learn to infer body pose and shape from a single image, but only consider minimally clothed humans. Various methods [48,60,6,42,41,18,21,59,28,36,13] have recently been proposed to reconstruct humans in clothing. BodyNet [48] and DeepHuman [60] output human shape in the form of occupancy voxel grids.…”
Section: Learning-Based Approaches with Monocular RGB
Mentioning confidence: 99%
“…Such a representation has difficulty capturing fine details due to its high memory footprint. Neural implicit functions have been introduced to replace the explicit voxel grid and have enabled high-fidelity reconstructions from single images [42,41,18,21,59,28]. A major limitation of these methods is the lack of generalization to unseen poses in the wild.…”
Section: Learning-Based Approaches with Monocular RGB
Mentioning confidence: 99%
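The voxel-versus-implicit contrast drawn in these excerpts can be made concrete with a short, generic sketch. Nothing below is taken from the cited papers: occupancy_net stands for any hypothetical network mapping 3D points to occupancy values in [0, 1], and the resolution and threshold are arbitrary. The point is only that the learned representation is the network itself; a dense grid is materialized transiently, at whatever resolution is desired, purely for surface extraction.

# Generic sketch: extracting a mesh from a learned implicit occupancy function.
# `occupancy_net` is a hypothetical model mapping (N, 3) points to (N, 1) occupancy.
import numpy as np
import torch
from skimage.measure import marching_cubes

def extract_mesh(occupancy_net, resolution=256, bound=1.0, threshold=0.5, batch=65536):
    # Build query points on a dense grid spanning [-bound, bound]^3.
    axis = np.linspace(-bound, bound, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
    occ = np.empty(len(grid), dtype=np.float32)
    with torch.no_grad():
        for i in range(0, len(grid), batch):
            pts = torch.from_numpy(grid[i:i + batch])
            occ[i:i + batch] = occupancy_net(pts).squeeze(-1).cpu().numpy()
    volume = occ.reshape(resolution, resolution, resolution)
    # Marching cubes recovers the iso-surface at the chosen occupancy threshold.
    verts, faces, normals, _ = marching_cubes(volume, level=threshold)
    # Map voxel indices back into world coordinates.
    verts = verts / (resolution - 1) * 2.0 * bound - bound
    return verts, faces, normals

A fixed voxel grid, by contrast, would have to store all resolution^3 occupancy values up front, which is the memory footprint the first excerpt refers to.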