Compared with traditional mine detection methods, UAV-based measures are more suitable for the rapid detection of large areas of scatterable landmines, and a multispectral fusion strategy based on a deep learning model is proposed to facilitate mine detection. Using the UAV-borne multispectral cruise platform, we establish a multispectral dataset of scatterable mines, with mine-spreading areas of the ground vegetation considered. In order to achieve the robust detection of occluded landmines, first, we employ an active learning strategy to refine the labeling of the multispectral dataset. Then, we propose an image fusion architecture driven by detection, in which we use YOLOv5 for the detection part, to improve the detection performance instructively while enhancing the quality of the fused image. Specifically, a simple and lightweight fusion network is designed to sufficiently aggregate texture details and semantic information of the source images and obtain a higher fusion speed. Moreover, we leverage detection loss as well as a joint-training algorithm to allow the semantic information to dynamically flow back into the fusion network. Extensive qualitative and quantitative experiments demonstrate that the detection-driven fusion (DDF) that we propose can effectively increase the recall rate, especially for occluded landmines, and verify the feasibility of multispectral data through reasonable processing.