Wavefront coding (WFC) is a depth-of-field (DOF) extension technique that combines optical encoding with digital decoding. The system extends the DOF at the expense of intermediate image quality and then recovers a sharp image through an image restoration algorithm. Because the point spread function varies with object distance, traditional decoding methods often suffer from artifacts and noise amplification. In this paper, based on lens-combined modulated wavefront coding (LM-WFC), we simulate the imaging process at different object distances, generate a WFC simulation dataset, and train a multi-scale convolutional neural network. Simulation experiments show that this method effectively reduces artifacts and improves image clarity. In addition, we used an LM-WFC camera to capture real scenes at different target distances. The decoding results show that the network model enhances restoration quality and produces clear images that better match human vision, which supports the improvement and practical application of wavefront coding systems.
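The sketch below illustrates the kind of data-generation step described above: forming a wavefront-coded PSF and convolving it with a sharp scene to produce an encoded (blurred) training input. It assumes a generic cubic phase mask and a simple defocus model; the function names, the phase parameters `alpha` and `psi`, and the image sizes are illustrative choices, not the paper's actual LM-WFC parameters.

```python
# Minimal sketch of simulating a wavefront-coded image for dataset generation.
# Assumes a cubic phase mask; alpha (mask strength) and psi (defocus, a proxy
# for object distance) are hypothetical parameters, not values from the paper.
import numpy as np
from scipy.signal import fftconvolve

def wfc_psf(size=64, alpha=30.0, psi=5.0):
    """Incoherent PSF of a pupil with cubic phase (alpha) and defocus (psi),
    both expressed in radians of wavefront error."""
    x = np.linspace(-1.0, 1.0, size)
    X, Y = np.meshgrid(x, x)
    aperture = (X**2 + Y**2) <= 1.0                      # circular pupil
    phase = alpha * (X**3 + Y**3) + psi * (X**2 + Y**2)  # cubic mask + defocus
    pupil = aperture * np.exp(1j * phase)
    # Coherent PSF is the Fourier transform of the pupil; intensity PSF is |.|^2
    psf = np.abs(np.fft.fftshift(np.fft.fft2(pupil, s=(2 * size, 2 * size)))) ** 2
    return psf / psf.sum()

def encode(scene, psf, noise_sigma=0.01):
    """Simulate the intermediate (encoded) WFC image: convolution plus noise."""
    blurred = fftconvolve(scene, psf, mode="same")
    return blurred + np.random.normal(0.0, noise_sigma, scene.shape)

# Example: one (encoded image, sharp scene) training pair for a single object
# distance, where the distance enters through the defocus parameter psi.
scene = np.random.rand(256, 256)          # stand-in for a sharp training image
pair = (encode(scene, wfc_psf(psi=8.0)), scene)
```

Repeating this over a range of defocus values would yield the multi-distance pairs on which a restoration network such as the multi-scale CNN mentioned above could be trained.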