The fusion of a hyperspectral image (HSI) and a multispectral image (MSI) can significantly improve the recognition and identification of ground targets. Spatial quality and spectral fidelity usually trade off against each other, yet both are essential indicators for multi-source remote-sensing image fusion. The smoothing filter-based intensity modulation (SFIM) method is a simple yet effective fusion model that enhances the spatial texture details of the image while largely preserving its spectral characteristics. However, traditional SFIM sharpens edge information poorly, which degrades the overall fusion result. To obtain better spatial information, this paper proposes an improved, spatial filter-based LSE-SFIM algorithm. First, the least squares estimation (LSE) algorithm is combined with SFIM, which effectively improves the spatial quality of the fused image. In addition, to better maintain the spatial information, four spatial filters (mean, median, nearest neighbor, and bilinear) are applied to the simulated MSI to extract fine spatial information. Six quality indexes are used to compare the performance of the different algorithms. The experimental results demonstrate that LSE-SFIM with the bilinear filter (LSE-SFIM-B) performs significantly better than the traditional SFIM algorithm and the other spatially enhanced LSE-SFIM variants proposed in this paper. Furthermore, LSE-SFIM-B achieves performance comparable to three state-of-the-art HSI-MSI fusion algorithms (CNMF, HySure, and FUSE) with a much shorter computing time.
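
For orientation, the classical SFIM ratio on which the method builds can be sketched for a single band pair: the fused band is the upsampled low-resolution band multiplied by the high-resolution band and divided by a smoothed copy of that high-resolution band. The sketch below is only an illustration under stated assumptions, not the LSE-SFIM algorithm of this paper. The function names, the block-mean smoothing, the nearest-neighbor upsampling, and the `eps` guard are all hypothetical choices made here, and the LSE step is omitted.

```python
import numpy as np


def upsample_nearest(img, ratio):
    """Repeat every pixel ratio x ratio times (nearest-neighbor upsampling)."""
    return np.kron(img, np.ones((ratio, ratio)))


def block_mean(img, ratio):
    """Average non-overlapping ratio x ratio blocks (assumed smoothing step)."""
    h, w = img.shape
    return img.reshape(h // ratio, ratio, w // ratio, ratio).mean(axis=(1, 3))


def sfim_band(low_res_band, high_res_band, ratio, eps=1e-12):
    """Classical SFIM ratio for one band pair (illustrative sketch only).

    low_res_band  : 2-D array, e.g. one HSI band, shape (h, w)
    high_res_band : 2-D array, e.g. one MSI band, shape (h * ratio, w * ratio)
    """
    low_up = upsample_nearest(low_res_band, ratio)
    # Smoothed high-resolution band at the spatial scale of the low-res band.
    smoothed = upsample_nearest(block_mean(high_res_band, ratio), ratio)
    # eps is an assumed guard against division by zero.
    return low_up * high_res_band / (smoothed + eps)


# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
high = rng.uniform(0.1, 1.0, size=(8, 8))
low = rng.uniform(0.1, 1.0, size=(2, 2))
fused = sfim_band(low, high, ratio=4)
```

With this choice of smoothing, averaging the fused band back over each block returns the original low-resolution value (up to `eps`), which illustrates why the ratio form preserves spectral information while injecting spatial detail from the high-resolution band.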