2020
DOI: 10.48550/arxiv.2012.12247
Preprint

Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video

Abstract: In this tech report, we present the current state of our ongoing work on reconstructing Neural Radiance Fields (NeRF) of general non-rigid scenes via ray bending. Non-rigid NeRF (NR-NeRF) takes RGB images of a deforming object (e.g., from a monocular video) as input and then learns a geometry and appearance representation that not only allows reconstructing the input sequence but also re-rendering any time step into novel camera views with high fidelity. In particular, we show that a consumer-grade camera is s…
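The abstract sketches the core idea: a learned deformation ("ray bending") maps each point sampled along a camera ray at a given time step into a shared canonical volume, where a single time-independent NeRF is evaluated. The minimal PyTorch-style sketch below illustrates that split; the module names, the per-frame latent code, and the omission of positional encoding and the paper's regularizers are editorial assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn

class RayBending(nn.Module):
    # Illustrative deformation network: given a point observed at some time
    # step (encoded by a learned per-frame latent code), predict the offset
    # that "bends" it into the shared canonical volume.
    def __init__(self, latent_dim=32, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # per-point offset
        )

    def forward(self, x, frame_code):
        # x: (N, 3) sample points in the observed (deformed) scene
        # frame_code: (N, latent_dim) latent code of the current time step
        return x + self.mlp(torch.cat([x, frame_code], dim=-1))

class CanonicalNeRF(nn.Module):
    # Illustrative canonical radiance field: color and density are regressed
    # once, in canonical space, and shared by all time steps.
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (r, g, b, sigma)
        )

    def forward(self, x_canonical):
        out = self.mlp(x_canonical)
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3:])

# Usage: bend observed-space samples into canonical space, then query the
# time-independent field there; rgb and sigma feed standard volume rendering.
bend, field = RayBending(), CanonicalNeRF()
points = torch.rand(1024, 3)              # samples along camera rays
frame_code = torch.zeros(1024, 32)        # latent code of one time step
rgb, sigma = field(bend(points, frame_code))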

Cited by 16 publications (29 citation statements)
References 53 publications
“…However, the neural representations above can only handle static scenes, and the literature on dynamic scene neural representation remains sparse. Recent work [Ost et al. 2020; Pumarola et al. 2020; Rebain et al. 2020; Tretschk et al. 2020; Xian et al. 2020] extends the neural radiance field approach of NeRF [Mildenhall et al. 2020a] to the dynamic setting. These methods decompose the task into learning a spatial mapping from the canonical scene to the current scene at each time step and regressing the canonical radiance field.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
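The decomposition quoted above can be summarized with a generic formulation (an editor's sketch, not the notation of any particular cited paper): a per-time-step deformation field maps an observed point into the canonical volume, where a single time-independent radiance field is evaluated,

\begin{equation}
  (\mathbf{c}, \sigma)(\mathbf{x}, t) = F_{\theta}\bigl(D_{\phi}(\mathbf{x}, t)\bigr),
  \qquad
  D_{\phi}(\mathbf{x}, t) = \mathbf{x} + \Delta_{\phi}(\mathbf{x}, t),
\end{equation}

where \Delta_{\phi} is the learned per-point offset (the "ray bending" of NR-NeRF) and (\mathbf{c}, \sigma) are the color and density used in standard NeRF volume rendering.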
“…Such data-driven approaches get rid of the heavy reliance on reconstruction accuracy or an extremely dense capture setting. Recent work [Ost et al. 2020; Rebain et al. 2020; Tretschk et al. 2020] extends the NeRF approach [Mildenhall et al. 2020a] to the dynamic setting. However, the above solutions for dynamic scene free-viewpoint synthesis still suffer from limited capture volume or are fragile under human motions.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
“…For neural modeling and rendering of dynamic scenes, NHR [64] embeds spatial features into sparse dynamic point clouds, while Neural Volumes [30] transforms input images into a 3D volume representation with a VAE network. More recently, [26, 44, 47, 48, 61, 65, 74] extend the neural radiance field (NeRF) [36] to the dynamic setting. They learn a spatial mapping from the canonical scene to the current scene at each time step and regress the canonical radiance field.…”
Section: Blended Image
Citation type: mentioning (confidence: 99%)
“…In particular, the approaches with implicit functions [50, 52, 73] reconstruct clothed humans with fine geometric detail but are restricted to humans only, without modeling human-object interactions. For photorealistic human performance rendering, various data representations have been explored, such as point clouds [42, 64], voxels [30], implicit representations [36, 47, 48, 62], or hybrid neural texturing [57]. However, existing solutions rely on dome-level dense RGB sensors or are limited to human priors without considering the joint rendering of human-object interactions.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
“…But they are still restricted by limited mesh resolution and fail to provide photo-realistic bullet-time effects. Recent neural rendering techniques [24, 28, 34, 41, 42, 46] show great potential for neural human free-viewpoint rendering from multiple RGB inputs. However, these solutions rely on per-scene training or struggle to achieve real-time performance due to their heavy networks.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)