“…In particular, approaches based on implicit functions [50,52,73] reconstruct clothed humans with fine geometric detail but are restricted to humans alone, without modeling human-object interactions. For photorealistic human performance rendering, various data representations have been explored, such as point clouds [42,64], voxels [30], implicit representations [36,47,48,62], and hybrid neural texturing [57]. However, existing solutions either rely on dome-level dense RGB sensors or are limited to human priors, without considering the joint rendering of human-object interactions.…”