A fast and robust template matching scheme, called Matching by Slice Transform Matrix Mapping (MSTMM), is proposed to address the difficulty of matching visible and thermal infrared images under occluded scenarios, which is caused by nonlinear intensity differences and structural differences between the images. The first step of the MSTMM scheme extracts information about the distribution of pixels with the same gray level through the developed Expanded Slice Transform with Adaptive Gray Level (EST-AGL). After the EST-AGL matrices have been constructed for all image patches, different EST-AGL matrices are mapped to different integers or floats through either the traditional special integer mapping mechanism or the neural network mapping mechanism. Finally, template matching between visible and thermal infrared images is achieved by evaluating the similarity of the correlation mapping surface images with the Normalized Cross Correlation (NCC) algorithm. The proposed EST-AGL method overcomes the nonlinear intensity differences between visible and thermal infrared images by extracting the structural features of the image. The mapping mechanism of the MSTMM scheme reduces the structural differences between the normal template image and the query image under an occluded scenario by increasing the similarity between normal image patches and image patches with occlusion. With a proper mapping mechanism, the MSTMM scheme achieves high performance using only the simple NCC algorithm in the similarity evaluation stage, instead of more time-consuming anti-occlusion dense feature algorithms.
The three main experimental results of the MSTMM scheme are as follows: (1) the MSTMM scheme achieves template matching in only 0.015 seconds when a 64 × 64 template image slides over a 256 × 256 query image on a hardware platform with limited resources; (2) the matching success rate of the MSTMM scheme reaches up to 75% among 2107 experimental samples; and (3) the neural network training in the neural network mapping mechanism takes as little as 104.4 seconds on the CPU.

INDEX TERMS Template matching, multimodal image, heterogeneous image, multisource image, visible and infrared image, image matching, neural network.
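To illustrate the similarity evaluation stage mentioned above, the following is a minimal sketch of sliding-window NCC between a template and a query image. It is not the authors' implementation: it assumes the zero-mean variant of NCC, operates on generic 2-D arrays rather than on the mapped EST-AGL images, and the function name `ncc_match` is hypothetical.

```python
import numpy as np

def ncc_match(query, template):
    """Slide `template` over `query` and return (row, col, score) of the
    best zero-mean normalized cross correlation. Illustrative sketch only."""
    query = np.asarray(query, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    th, tw = template.shape
    qh, qw = query.shape

    # Zero-mean template and its norm are computed once.
    t = template - template.mean()
    t_norm = np.sqrt((t * t).sum())

    best = (0, 0, -np.inf)
    for r in range(qh - th + 1):
        for c in range(qw - tw + 1):
            w = query[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = t_norm * np.sqrt((w * w).sum())
            if denom == 0:
                # Flat window or flat template: correlation is undefined.
                continue
            score = (w * t).sum() / denom
            if score > best[2]:
                best = (r, c, score)
    return best
```

For the sizes quoted in the abstract, a 64 × 64 template on a 256 × 256 query gives 193 × 193 candidate positions. This double loop is written for clarity rather than speed and makes no claim to reproduce the reported timing.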