Low spatial resolution is a well-known problem for depth maps captured by low-cost consumer depth cameras. Depth map super-resolution (SR) can be used to enhance the resolution and improve the quality of depth maps. In this paper, we propose a recumbent Y network (RYNet) to integrate depth information and intensity information for depth map SR. Specifically, we introduce two weight-shared encoders to respectively learn multi-scale depth and intensity features, and a single decoder to gradually fuse the depth and intensity information for reconstruction. We also design a residual channel attention-based atrous spatial pyramid pooling structure to further enrich the scale diversity of the features and exploit the correlations between multi-scale feature channels. Furthermore, violations of the co-occurrence assumption between depth discontinuities and intensity edges generate texture-transfer and depth-bleeding artifacts. Thus, we propose a spatial attention mechanism that mitigates these artifacts by adaptively learning the spatial relevance between intensity features and depth features and reweighting the intensity features before fusion. Experimental results demonstrate the superiority of the proposed RYNet over several state-of-the-art depth map SR methods.

INDEX TERMS Depth map super-resolution, convolutional neural network, U-Net, atrous spatial pyramid pooling, attention mechanism.
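Since the spatial attention mechanism is only described at a high level in the abstract, the following is a minimal sketch of how such attention-based reweighting of intensity features before fusion might look. It assumes PyTorch, and all names (e.g., SpatialAttentionFusion) are hypothetical illustrations, not the authors' implementation.

```python
# A minimal, illustrative sketch (not the authors' code) of the spatial
# attention idea described above: a per-pixel relevance map is learned
# from the concatenated depth and intensity features, and the intensity
# features are reweighted by it before fusion with the depth branch.
import torch
import torch.nn as nn

class SpatialAttentionFusion(nn.Module):  # hypothetical module name
    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 conv maps the concatenated features to a single-channel
        # spatial relevance map squashed into [0, 1].
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, depth_feat: torch.Tensor,
                intensity_feat: torch.Tensor) -> torch.Tensor:
        # Estimate where intensity edges are actually relevant to depth
        # discontinuities; suppressing intensity features elsewhere is
        # what reduces texture-transfer and depth-bleeding artifacts.
        relevance = self.attn(torch.cat([depth_feat, intensity_feat], dim=1))
        return depth_feat + relevance * intensity_feat

# Usage: fuse 64-channel feature maps from the two encoder branches.
fuse = SpatialAttentionFusion(channels=64)
d = torch.randn(1, 64, 32, 32)   # depth features
i = torch.randn(1, 64, 32, 32)   # intensity features
out = fuse(d, i)                 # fused features, same shape as inputs
```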