Most current feature extraction methods lack generality: images from different sensors require sensor-specific extraction methods, which is a particular limitation in fault detection for power equipment. A further challenge is restoring realistic texture details while correcting color distortion. To address these issues, we propose an infrared-visible image fusion technique that combines a hierarchical attention module with a collaborative refinement module, so that feature fusion jointly preserves intricate details and corrects illumination. The hierarchical attention module produces two attention weight maps, one for the infrared domain and one for the visible light domain, which select the most salient information from each source image. This yields an intermediate fusion result that contains comprehensive and complementary information. The collaborative refinement module consists of an edge enhancement network and a lighting correction network, which work together to enhance edge details and correct color distortion. Experimental results validate the effectiveness of the method in fusing the two types of images. Compared with several mainstream fusion methods, the proposed method shows clear advantages on publicly available datasets and in practical scenarios such as power equipment detection.
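
To make the role of the two attention weight maps concrete, the following is a minimal sketch of attention-weighted pixel fusion. It is not the network described above: the per-pixel saliency scores are assumed to be given (in the proposed method they would come from the learned attention module), the softmax normalization is an assumption, and the names `softmax_pair` and `fuse` are hypothetical.

```python
import math


def softmax_pair(a, b):
    """Turn two raw saliency scores into two weights that sum to 1."""
    m = max(a, b)  # subtract the max for numerical stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    total = ea + eb
    return ea / total, eb / total


def fuse(ir, vis, ir_score, vis_score):
    """Pixel-wise weighted fusion of two same-sized grayscale images.

    All arguments are lists of rows of floats. The score maps are
    hypothetical stand-ins for the outputs of an attention module.
    """
    fused = []
    for row_ir, row_vis, row_si, row_sv in zip(ir, vis, ir_score, vis_score):
        out_row = []
        for p_ir, p_vis, s_ir, s_vis in zip(row_ir, row_vis, row_si, row_sv):
            w_ir, w_vis = softmax_pair(s_ir, s_vis)
            # Each output pixel is a convex combination of the two sources.
            out_row.append(w_ir * p_ir + w_vis * p_vis)
        fused.append(out_row)
    return fused
```

With equal scores the two sources are averaged; when one score dominates, the fused pixel follows that source, which is the sense in which the weight maps "select" salient information in this simplified setting.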