Image fusion is an effective complementary method to obtain information from multi-source data. In particular, the fusion of synthetic aperture radar (SAR) and panchromatic images contributes to the better visual perception of objects and compensates for spatial information. However, conventional fusion methods fail to address the differences in imaging mechanism and, therefore, they cannot fully consider all information. Thus, this paper proposes a novel fusion method that both considers the differences in imaging mechanisms and sufficiently provides spatial information. The proposed method is learning-based; it first selects data to be used for learning. Then, to reduce the complexity, classification is performed on the stacked image, and the learning is performed independently for each class. Subsequently, to consider sufficient information, various features are extracted from the SAR image. Learning is performed based on the model’s ability to establish non-linear relationships, minimizing the differences in imaging mechanisms. It uses a representative non-linear regression model, random forest regression. Finally, the performance of the proposed method is evaluated by comparison with conventional methods. The experimental results show that the proposed method is superior in terms of visual and quantitative aspects, thus verifying its applicability.