“…Some methods leveraged the additional temporal information to perform novel-view synthesis from a single video captured by a moving camera instead of large collections of multi-view images [12,16,28,44,46,56,65,69]. Among these, the reconstruction of humans also gained increasing interest, where morphable [16] and implicit generative models [69], pre-trained features [59], or deformation fields [44,65] were employed to regularize the reconstruction. Furthermore, TöRF [1] used time-of-flight sensor measurements as an additional source of information, and DyNeRF [27] learned time-dependent latent codes to constrain the radiance field.…”