In this paper, we propose a perceptual quality evaluation method for image fusion based on human visual system (HVS) models. The method assesses the quality of a fused image in three steps. First, the source and fused images are filtered by a contrast sensitivity function (CSF), after which a local contrast map is computed for each image. Second, a contrast preservation map is generated to describe the relationship between the fused image and each source image. Finally, the preservation maps are weighted by a saliency map to obtain an overall quality map, whose mean gives the quality score of the fused image. Experiments compare the predictions of our algorithm with human perceptual evaluations under several parameter settings. For some specific parameter settings, our algorithm's predictions match human perceptual evaluations more closely than those of existing algorithms.
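The three steps can be illustrated with a minimal sketch for two grayscale source images. The abstract does not specify the individual models, so every concrete choice below is an assumption made for illustration: the Mannos-Sakrison form of the CSF, the viewing-condition parameter `cycles_per_degree_at_nyquist`, an RMS-style local contrast, a min/max ratio as the preservation measure, and squared contrast as the saliency weight. All function names are hypothetical.

```python
import numpy as np


def csf_filter(img, cycles_per_degree_at_nyquist=32.0):
    """Filter an image in the frequency domain with a CSF.

    Assumption: Mannos-Sakrison CSF; the mapping from pixels to
    cycles/degree is a hypothetical viewing-condition parameter.
    """
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]  # cycles/pixel in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]
    f = np.sqrt(fx**2 + fy**2) / 0.5 * cycles_per_degree_at_nyquist
    gain = 2.6 * (0.0192 + 0.114 * f) * np.exp(-((0.114 * f) ** 1.1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * gain))


def box_mean(img, radius):
    """Mean over a (2*radius+1)^2 window, computed with an integral image."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = (integral[k:, k:] - integral[:-k, k:]
             - integral[k:, :-k] + integral[:-k, :-k])
    return total / (k * k)


def local_contrast(img, radius=3, eps=1e-6):
    """Step 1 (assumed form): local std of the CSF-filtered image,
    normalized by the local mean luminance of the unfiltered image."""
    filtered = csf_filter(img)
    mean = box_mean(filtered, radius)
    var = np.maximum(box_mean(filtered**2, radius) - mean**2, 0.0)
    return np.sqrt(var) / (np.abs(box_mean(img, radius)) + eps)


def preservation(c_src, c_fused, eps=1e-6):
    """Step 2 (assumed form): min/max contrast ratio, in (0, 1]."""
    lo = np.minimum(c_src, c_fused)
    hi = np.maximum(c_src, c_fused)
    return (lo + eps) / (hi + eps)


def fusion_quality(src_a, src_b, fused, radius=3, eps=1e-6):
    """Step 3 (assumed form): saliency-weighted quality map and its mean."""
    c_a = local_contrast(src_a, radius, eps)
    c_b = local_contrast(src_b, radius, eps)
    c_f = local_contrast(fused, radius, eps)
    q_a = preservation(c_a, c_f, eps)
    q_b = preservation(c_b, c_f, eps)
    # Saliency: the source with higher local contrast gets more weight.
    w_a = (c_a**2 + eps) / (c_a**2 + c_b**2 + 2 * eps)
    quality_map = w_a * q_a + (1.0 - w_a) * q_b
    return float(quality_map.mean()), quality_map


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.random((64, 64)), rng.random((64, 64))
    score, _ = fusion_quality(a, b, 0.5 * (a + b))
    print(f"quality of the averaged image: {score:.3f}")
```

Under these assumptions the score lies between 0 and 1. A fused image identical to both sources scores approximately 1, while a featureless fused image that discards the source contrast scores close to 0.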