Visible and infrared image fusion (VIF), which aims to capture salient foreground information, has strong application potential and has made substantial progress with deep neural networks. However, the feature degradation and spatial detail loss that occur in the feed-forward process of existing deep networks remain difficult to resolve. In this paper, we propose an input modality-independent feature analysis-reconstruction fusion network to address these problems. In the feature extraction stage, a feed-forward feature enhancement module (DFEM) is embedded to explicitly enhance the salient features of the infrared and visible modalities, respectively. In addition, an attention template based on global correlation is constructed to converge the feature maps of different channels into a consistent fused representation. Afterwards, dynamic convolution is used to adaptively construct convolutional kernels conditioned on the current input and generate the fused image. A perceptual loss function is also added to the encoder training to further preserve the semantic information in the fused features for reference-free image scenarios. Subjective evaluations and multiple objective metrics on the TNO and RoadScene datasets show that the proposed method outperforms existing fusion baselines, achieving higher average EN, MI, QAB/F, and SCD scores. Moreover, the fusion results better preserve the visible background texture and the contrast of infrared salient targets.
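As a rough illustration of the input-conditioned (dynamic) convolution step mentioned above, the sketch below mixes several candidate kernels with attention weights predicted from the current input and applies the resulting per-sample kernel. This is a minimal PyTorch sketch under assumed settings (the class name, kernel count, gating network, and channel sizes are illustrative), not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """Illustrative dynamic convolution: K candidate kernels are mixed with
    attention weights predicted from the current input (global pooling + MLP)."""
    def __init__(self, in_ch, out_ch, kernel_size=3, num_kernels=4, reduction=4):
        super().__init__()
        self.in_ch, self.out_ch = in_ch, out_ch
        self.kernel_size, self.num_kernels = kernel_size, num_kernels
        # K candidate kernels, each of shape (out_ch, in_ch, k, k)
        self.weight = nn.Parameter(
            0.02 * torch.randn(num_kernels, out_ch, in_ch, kernel_size, kernel_size))
        # Gating network: global average pool -> MLP -> softmax over the K kernels
        hidden = max(in_ch // reduction, 4)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, num_kernels))

    def forward(self, x):
        b, c, h, w = x.shape
        attn = F.softmax(self.gate(x), dim=1)                      # (B, K)
        # Per-sample kernel: weighted sum of candidates -> (B, out_ch, in_ch, k, k)
        weight = torch.einsum('bk,koihw->boihw', attn, self.weight)
        # Apply each sample's kernel via a grouped convolution over the batch
        x = x.reshape(1, b * c, h, w)
        weight = weight.reshape(b * self.out_ch, self.in_ch,
                                self.kernel_size, self.kernel_size)
        out = F.conv2d(x, weight, padding=self.kernel_size // 2, groups=b)
        return out.reshape(b, self.out_ch, h, w)

# Example: map 64-channel fused features to a single-channel fused image
fuse = DynamicConv2d(in_ch=64, out_ch=1)
y = fuse(torch.randn(2, 64, 128, 128))   # -> (2, 1, 128, 128)
```

The grouped-convolution trick simply lets each sample in the batch be filtered by its own input-conditioned kernel in one call; the actual kernel-generation scheme in the paper may differ.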