“…There were some other approaches in which various cues were used to build a loss function sufficient for training the network, including the mesh [31], the texture [44], multi-view images [34], the optimized SMPL model [30], and video [27,29]. To model detailed appearance, some methods attempt to refine the regressed SMPL model to obtain a detailed 3D model [1,3,23,32,42,53,61,62]. In [1,3,32], after estimating the pose and shape of the SMPL model, the authors used shape from shading and texture translation to add details to SMPL, such as the face, hairstyle, and clothes with garment wrinkles.…”