Three-dimensional (3D) surface models, e.g., digital elevation models (DEMs), are important for planetary exploration missions and scientific research. Current DEMs of the Martian surface are mainly generated by laser altimetry or photogrammetry, which have respective limitations. Laser altimetry cannot produce high-resolution DEMs; photogrammetry requires stereo images, but high-resolution stereo images of Mars are rare. An alternative is the convolutional neural network (CNN) technique, which implicitly learns features by assigning corresponding inputs and outputs. In recent years, CNNs have exhibited promising performance in the 3D reconstruction of close-range scenes. In this paper, we present a CNN-based algorithm that is capable of generating DEMs from single images; the DEMs have the same resolutions as the input images. An existing low-resolution DEM is used to provide global information. Synthetic and real data, including context camera (CTX) images and DEMs from stereo High-Resolution Imaging Science Experiment (HiRISE) images, are used as training data. The performance of the proposed method is evaluated using single CTX images of representative landforms on Mars, and the generated DEMs are compared with those obtained from stereo HiRISE images. The experimental results show promising performance of the proposed method. The topographic details are well reconstructed, and the geometric accuracies achieve root-mean-square error (RMSE) values ranging from 2.1 m to 12.2 m (approximately 0.5 to 2 pixels in the image space). The experimental results show that the proposed CNN-based method has great potential for 3D surface reconstruction in planetary applications.