In this paper, we propose a novel 3D reconstruction framework that reconstructs the surface of a target object accurately and robustly from multi-view depth maps. A depth map of a moving object tends to exhibit spatially varying perspective warps caused by motion blur and rolling shutter artifacts. Merging the resulting misaligned points from multiple views into the world coordinate frame leads to significant artifacts in the reconstructed shape. We address these mismatches with a patch-based depth-to-surface alignment that uses an implicit surface-based distance measure. The patch-based minimization finds spatial warps on the depth map quickly and accurately while preserving the global transformation. Thanks to the point-to-surface distance defined on an implicit representation, the proposed framework efficiently optimizes the local alignments and is robust to depth occlusions and local variations. The proposed method shows significant improvements over other reconstruction methods, demonstrating its efficiency and benefits in multi-view reconstruction.
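
The idea of aligning a depth patch to an implicit surface by minimizing a point-to-surface distance can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the implicit surface is a unit-sphere signed distance function (`sphere_sdf`), the per-patch warp is restricted to a pure translation, and the minimizer is plain gradient descent with a hand-picked step size. The helper names `sdf_gradient`, `patch_energy`, and `align_patch` are hypothetical.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from p to a sphere (stand-in implicit surface, an assumption)."""
    return math.dist(p, center) - radius

def sdf_gradient(sdf, p, eps=1e-5):
    """Central-difference gradient of the implicit function at p."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * eps))
    return grad

def patch_energy(sdf, patch, t):
    """Mean squared point-to-surface distance of the patch shifted by t."""
    return sum(sdf([p[i] + t[i] for i in range(3)]) ** 2 for p in patch) / len(patch)

def align_patch(sdf, patch, steps=200, lr=0.5):
    """Translation-only patch alignment by gradient descent (illustrative choice)."""
    t = [0.0, 0.0, 0.0]
    for _ in range(steps):
        g = [0.0, 0.0, 0.0]
        for p in patch:
            q = [p[i] + t[i] for i in range(3)]
            d = sdf(q)                 # signed distance to the surface
            n = sdf_gradient(sdf, q)   # approximate surface normal direction
            for i in range(3):
                g[i] += 2.0 * d * n[i] / len(patch)
        t = [t[i] - lr * g[i] for i in range(3)]
    return t

# Synthetic patch: points on the unit sphere, shifted by a small misalignment.
offset = (0.03, -0.02, 0.05)
patch = []
for a in range(5):
    for b in range(5):
        theta = 0.1 + 0.1 * a   # polar angle
        phi = 0.3 * b           # azimuth
        s = (math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta))
        patch.append(tuple(s[i] + offset[i] for i in range(3)))

before = patch_energy(sphere_sdf, patch, [0.0, 0.0, 0.0])
t = align_patch(sphere_sdf, patch)
after = patch_energy(sphere_sdf, patch, t)
print("energy before:", before, "after:", after)
```

Because the residual is measured against the surface rather than against corresponding points, the patch needs no explicit point matches, only an implicit function that can be evaluated at each warped depth sample. Note that a small, nearly planar patch constrains the translation mainly along the surface normal, so the sketch reduces the distance to the surface without necessarily recovering the exact offset.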