Determining the distance from a robot to the scene is useful for object tracking, and 3-D reconstruction is required by many manufacturing and robotic tasks. While the robot processes materials (e.g., welding, milling, or drilling parts), fragments of material fall onto the camera mounted on the robot, introducing spurious information into the depth map and creating new lost areas, which leads to incorrect estimation of object sizes. These incorrect regions in the depth map, caused by erroneous distance measurements, in turn reduce the accuracy of trajectory planning. We present a two-stage approach combining defect detection and depth reconstruction algorithms. The first stage detects image defects using a convolutional autoencoder (U-Net) and a deep feature fusion network (DFFN-Net). The second stage reconstructs the depth map using exemplar-based and anisotropic-gradient concepts. The proposed modified patch-fusion algorithm uses a local image descriptor obtained by the autoencoder for image reconstruction, extracting image features and depth maps with a decoding network. Our technique quantitatively outperforms state-of-the-art methods in reconstruction accuracy on an RGB-D benchmark for evaluating manufacturing vision systems.
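The two-stage pipeline described above can be illustrated with a minimal sketch. This is not the paper's method: the defect mask here is obtained by thresholding a given autoencoder reconstruction error (standing in for the U-Net/DFFN-Net detector), and the hole filling uses simple neighbour diffusion as a stand-in for the exemplar-based, anisotropic-gradient reconstruction. All function names and thresholds are illustrative assumptions.

```python
import numpy as np

def detect_defects(image: np.ndarray, reconstruction: np.ndarray,
                   thresh: float = 0.1) -> np.ndarray:
    """Stage 1 (sketch): flag pixels whose reconstruction error exceeds
    a threshold as defects. In the paper this mask would come from the
    U-Net / DFFN-Net detector; here the reconstruction is assumed given."""
    return np.abs(image - reconstruction) > thresh

def fill_depth(depth: np.ndarray, mask: np.ndarray, iters: int = 200) -> np.ndarray:
    """Stage 2 (sketch): fill masked depth pixels by iteratively averaging
    their valid 4-neighbours -- a diffusion stand-in for exemplar-based
    depth-map reconstruction."""
    d = depth.astype(float).copy()
    d[mask] = np.nan                      # mark defective pixels as unknown
    for _ in range(iters):
        padded = np.pad(d, 1, constant_values=np.nan)
        # Stack the 4-neighbourhood of every pixel.
        stack = np.stack([padded[:-2, 1:-1], padded[2:, 1:-1],
                          padded[1:-1, :-2], padded[1:-1, 2:]])
        valid = ~np.isnan(stack)
        counts = valid.sum(axis=0)
        sums = np.where(valid, stack, 0.0).sum(axis=0)
        avg = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
        holes = np.isnan(d)
        d[holes] = avg[holes]             # fill holes from their known neighbours
        if not np.isnan(d).any():
            break
    return d
```

A hole surrounded by valid depth values is filled with the average of its neighbours in one pass; larger holes shrink inward over successive iterations.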