Although 3D human reconstruction methods based on the pixel-aligned implicit function (PIFu) are developing rapidly, we observe that the quality of the reconstructed details is still unsatisfactory. Flat facial surfaces frequently occur in PIFu-based reconstruction results. To this end, we propose a two-scale PIFu representation to enhance the quality of the reconstructed facial details. Specifically, we utilize two MLPs to separately represent the PIFus for the face and the human body. An MLP dedicated to 3D face reconstruction increases the network capacity and reduces the difficulty of reconstructing facial details that the previous one-scale PIFu representation faces. To remedy topology errors, we leverage three RGB-D sensors to capture multi-view RGB-D data as the input to the network, which is a sparse, lightweight capture setting. Since depth noise severely degrades the reconstruction results, we design a depth refinement module that reduces the noise in the raw depth under the guidance of the input RGB images. We also propose an adaptive fusion scheme that fuses the predicted occupancy fields of the body and face to eliminate discontinuity artifacts at their boundaries. Experiments demonstrate the effectiveness of our approach in reconstructing vivid facial details and deforming body shapes, and verify its superiority over state-of-the-art methods.
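The idea of fusing two occupancy fields without a visible seam can be illustrated with a minimal sketch. The abstract does not specify how the adaptive weights are computed, so the distance-based smoothstep weight below is purely an assumption for illustration, as are the names `face_weight`, `fuse_occupancy`, `face_center`, `inner_radius`, and `outer_radius`; it should not be read as the scheme used in this work.

```python
import math


def smoothstep(t: float) -> float:
    """Clamp t to [0, 1] and apply the cubic smoothstep 3t^2 - 2t^3."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def face_weight(point, face_center, inner_radius, outer_radius):
    """Hypothetical blend weight for the face field at a 3D query point.

    Returns 1.0 within inner_radius of face_center, 0.0 beyond
    outer_radius, and a smooth transition in between (an assumed rule,
    not taken from the source).
    """
    d = math.dist(point, face_center)
    return 1.0 - smoothstep((d - inner_radius) / (outer_radius - inner_radius))


def fuse_occupancy(occ_body: float, occ_face: float, weight: float) -> float:
    """Blend the two predicted occupancies as a convex combination."""
    return weight * occ_face + (1.0 - weight) * occ_body


# Usage: a query point halfway through the transition band.
w = face_weight((0.15, 0.0, 0.0), (0.0, 0.0, 0.0), 0.1, 0.2)
fused = fuse_occupancy(occ_body=0.2, occ_face=0.8, weight=w)
```

Because the weight varies continuously from 1 to 0 across the transition band, the fused field equals the face prediction near the face, equals the body prediction elsewhere, and has no jump at the boundary between the two regions.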