Visual saliency detection, which aims to simulate the human visual system (HVS), has drawn much attention in recent decades. Reconstruction-based saliency detection models predict unexpected regions via linear combinations or auto-encoder networks. However, these models handle images poorly because the conversion from images to vectors discards spatial information. In this paper, a novel approach is proposed to solve this problem. Its core is a deep reconstruction model, i.e., a convolutional neural network for reconstruction stacked with an auto-encoder (CN-NR). On the one hand, the CNN can directly take two-dimensional data as input, instead of having to convert the image matrix into a series of vectors as in conventional reconstruction-based saliency detection methods. On the other