Multimodal medical image fusion is a fundamental but challenging problem in brain science research and brain disease diagnosis. For sparse representation (SR)-based fusion in particular, it is difficult to characterize the activity level with a single measure without losing effective information. In this paper, a Kronecker-criterion-based SR framework is applied to medical image fusion, with a patch-based activity level that integrates salient features from multiple domains. Inspired by the formation process of the visual system, spatial saliency is characterized by textural contrast (TC), which combines luminance and orientation contrasts so that more highlighted texture information participates in the fusion process. As a substitute for the conventional l1-norm-based sparse saliency, a sum of sparse salient features (SSSF) metric is used so that more significant coefficients contribute to the activity level measure. The designed activity level measure is shown to be more conducive to maintaining the integrity and sharpness of detailed information. Experiments on multiple groups of clinical medical images verify the effectiveness of the proposed fusion method in terms of both visual quality and objective assessment. Furthermore, this work is helpful for subsequent detection and segmentation of medical images.
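For context, the conventional l1-norm-based activity level that the SSSF metric is meant to replace can be sketched as below. This is only an illustrative baseline under stated assumptions, not the proposed method: the function names `l1_activity` and `fuse_max_l1`, the choose-max selection rule, and the toy coefficient matrices are all hypothetical, and each column is assumed to hold the sparse coefficients of one image patch.

```python
import numpy as np

def l1_activity(coeffs):
    """Per-patch activity level: l1-norm of each column of sparse coefficients."""
    return np.abs(coeffs).sum(axis=0)

def fuse_max_l1(coeffs_a, coeffs_b):
    """Choose-max rule (assumed baseline): for each patch, keep the
    coefficient vector whose l1 activity is larger; ties go to source A."""
    take_a = l1_activity(coeffs_a) >= l1_activity(coeffs_b)
    # Broadcast the per-patch decision across all dictionary atoms (rows).
    return np.where(take_a[np.newaxis, :], coeffs_a, coeffs_b)

# Toy sparse coefficients: 3 dictionary atoms (rows) x 3 patches (columns).
coeffs_a = np.array([[0.0,  2.0, 0.0],
                     [1.0,  0.0, 0.0],
                     [0.0, -1.0, 0.5]])
coeffs_b = np.array([[3.0, 0.0,  0.0],
                     [0.0, 0.5,  0.0],
                     [0.0, 0.0, -0.5]])

fused = fuse_max_l1(coeffs_a, coeffs_b)
```

In this toy case the first patch is taken from source B and the other two from source A. A single l1 score per patch is exactly the kind of one-dimensional measure the abstract argues is insufficient, which motivates combining it with spatial-domain saliency.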