Depth measurement methods based on structured light are popular owing to their low cost, portability, and ease of implementation. Depth information is recovered from the geometry of the imaging system via triangulation, which usually requires local stereo matching. However, matching is computationally intensive, and matching errors reduce depth accuracy and degrade the resulting depth maps. To address these problems, this paper proposes a novel depth measurement method based on a convolutional neural network (DMCNN), which casts depth estimation as a pixel-wise classification–regression task that requires no matching. Firstly, the DMCNN is designed as an encoder–decoder network: a feature pyramid in the encoder extracts multi-scale fused features, and parallel classification and regression branches at the end of the decoder predict depth from coarse to fine. Secondly, we use a four-step phase-shifting algorithm to generate ground-truth depth maps and build a dataset containing a large number of speckle-distorted images paired with their corresponding depth maps to train our network. The network is trained on an RTX 2080Ti graphics processing unit (GPU) with 20,000 training images. Experimental results show that our method achieves higher accuracy than alternative depth measurement methods.
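The coarse-to-fine output stage can be pictured as a per-pixel depth-bin classifier paired with a residual regressor. The following is a minimal PyTorch sketch of that idea only; the channel count, bin count, depth range, and combination rule are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class CoarseToFineHead(nn.Module):
    """Parallel classification and regression branches on decoder features.

    Illustrative sketch: the classification branch predicts a coarse depth
    bin per pixel, the regression branch predicts a sub-bin residual, and
    the two are combined into a continuous depth value.
    """

    def __init__(self, in_channels=64, num_bins=80, d_min=0.5, d_max=2.5):
        super().__init__()
        self.num_bins = num_bins
        self.d_min, self.d_max = d_min, d_max
        # Coarse branch: per-pixel logits over discrete depth bins.
        self.cls_branch = nn.Conv2d(in_channels, num_bins, kernel_size=1)
        # Fine branch: per-pixel residual, squashed to [-0.5, 0.5] bin widths.
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_channels, 1, kernel_size=1), nn.Tanh()
        )

    def forward(self, feats):
        logits = self.cls_branch(feats)                     # (B, num_bins, H, W)
        residual = 0.5 * self.reg_branch(feats).squeeze(1)  # (B, H, W)
        bin_idx = logits.argmax(dim=1).float()              # coarse bin per pixel
        bin_width = (self.d_max - self.d_min) / self.num_bins
        # Continuous depth = bin centre + regressed sub-bin offset.
        depth = self.d_min + (bin_idx + 0.5 + residual) * bin_width
        return logits, depth
```

Under this split, the logits would typically be supervised with a cross-entropy loss against the ground-truth bin and the residual with an L1 loss, so the classifier provides a coarse estimate that the regressor refines.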
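For the ground-truth generation step, four-step phase shifting recovers a wrapped phase from four fringe images offset by π/2. A minimal NumPy sketch of that standard computation follows, assuming images I0–I3 with shifts of 0, π/2, π, and 3π/2; the subsequent phase unwrapping and phase-to-depth conversion depend on the system calibration and are omitted here.

```python
import numpy as np

def wrapped_phase(i0, i1, i2, i3):
    """Four-step phase shifting: recover the wrapped phase from four
    fringe images I_k = A + B*cos(phi + k*pi/2), k = 0..3.

    With these shifts, I3 - I1 = 2B*sin(phi) and I0 - I2 = 2B*cos(phi),
    so phi = arctan2(I3 - I1, I0 - I2), wrapped to (-pi, pi].
    """
    return np.arctan2(i3.astype(np.float64) - i1,
                      i0.astype(np.float64) - i2)

# Synthetic check: the phase of a known fringe pattern is recovered.
phi = np.linspace(-np.pi, np.pi, 512, endpoint=False)
imgs = [128 + 100 * np.cos(phi + k * np.pi / 2) for k in range(4)]
assert np.allclose(wrapped_phase(*imgs), phi, atol=1e-9)
```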