Abstract. Image fusion combines several images of the same scene into a fused image, which contains all important information. Multiscale transform and sparse representation can solve this problem effectively. However, due to the limited number of dictionary atoms, it is difficult to provide an accurate description for image details in the sparse representation-based image fusion method, and it needs a great deal of calculations. In addition, for the multiscale transform-based method, the low-pass subband coefficients are so hard to represent sparsely that they cannot extract significant features from images. In this paper, a nonsubsampled contourlet transform (NSCT) and sparse representation-based image fusion method (NSCTSR) is proposed. NSCT is used to perform a multiscale decomposition of source images to express the details of images, and we present a dictionary learning scheme in NSCT domain, based on which we can represent low-frequency information of the image sparsely in order to extract the salient features of images. Furthermore, it can reduce the calculation cost of the fusion algorithm with sparse representation by the way of nonoverlapping blocking. The experimental results show that the proposed method outperforms both the fusion method based on single sparse representation and multiscale decompositon. © The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.