Interferometric Synthetic Aperture Radar (InSAR) is a widely used remote sensing technology for Earth observation that detects and measures ground deformation through interferograms. However, phase noise remains a critical factor that degrades interferogram quality. To address this issue, this study proposes MOMFNet, a deep learning approach for InSAR phase filtering based on multi-objective multi-kernel feature extraction. MOMFNet incorporates a multi-objective loss function that accounts for both the spatial and statistical characteristics of the denoising results, while its multi-kernel convolutional feature extraction module captures information at multiple scales. In addition, weighted residual blocks allow the model to adaptively adjust the importance of features, improving its ability to identify and suppress noise. To train MOMFNet, we developed an interferogram simulation strategy that uses randomly distorted 2D Gaussian surfaces to simulate terrain variations, Perlin noise to model atmospheric turbulence phases, and negative Gaussian noise to generate random training samples at different noise levels. Qualitative and quantitative comparisons with traditional denoising methods and other deep learning approaches show that MOMFNet excels in noise suppression and phase recovery, particularly in scenarios involving large phase gradients and random noise. Experiments on Sentinel-1 data from the Yanzhou coal mine confirm the practical value of MOMFNet: it removes noise while preserving critical phase details, significantly improving interferogram quality. This research provides insights into the application of deep learning to InSAR denoising.
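As a rough illustration of the simulation strategy described above, the following sketch generates one clean/noisy pair of wrapped interferograms. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names (`gaussian_surface`, `smooth_noise`, `simulate_pair`), all parameter ranges, the use of a bilinearly interpolated random lattice as a simple stand-in for Perlin noise, and the noise model itself (zero-mean Gaussian noise added to the real and imaginary parts of the unit phasor), since the abstract does not specify how its noise is applied.

```python
import math
import random


def wrap(phase):
    """Wrap a phase value into [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def gaussian_surface(size, rng, n_blobs=3, max_amp=20.0):
    """Sum of randomly placed, stretched and rotated 2D Gaussians (unwrapped phase, rad)."""
    blobs = []
    for _ in range(n_blobs):
        blobs.append((
            rng.uniform(-max_amp, max_amp),   # amplitude (assumed range)
            rng.uniform(0, size),             # centre x0
            rng.uniform(0, size),             # centre y0
            rng.uniform(size / 8, size / 3),  # sigma along first axis
            rng.uniform(size / 8, size / 3),  # sigma along second axis
            rng.uniform(0, math.pi),          # rotation angle
        ))
    surface = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            total = 0.0
            for amp, x0, y0, sx, sy, theta in blobs:
                dx, dy = x - x0, y - y0
                # Rotate coordinates into the blob's own axes.
                u = dx * math.cos(theta) + dy * math.sin(theta)
                v = -dx * math.sin(theta) + dy * math.cos(theta)
                total += amp * math.exp(-0.5 * ((u / sx) ** 2 + (v / sy) ** 2))
            surface[y][x] = total
    return surface


def smooth_noise(size, rng, cell=8, amp=1.0):
    """Bilinearly interpolated random lattice: a simple stand-in for Perlin noise."""
    n = size // cell + 2
    lattice = [[rng.uniform(-amp, amp) for _ in range(n)] for _ in range(n)]
    field = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            gx, gy = x / cell, y / cell
            ix, iy = int(gx), int(gy)
            fx, fy = gx - ix, gy - iy
            top = lattice[iy][ix] * (1 - fx) + lattice[iy][ix + 1] * fx
            bottom = lattice[iy + 1][ix] * (1 - fx) + lattice[iy + 1][ix + 1] * fx
            field[y][x] = top * (1 - fy) + bottom * fy
    return field


def simulate_pair(size=32, noise_sigma=0.5, seed=0):
    """Return (clean, noisy) wrapped-phase images as nested lists of floats."""
    rng = random.Random(seed)
    terrain = gaussian_surface(size, rng)
    atmosphere = smooth_noise(size, rng)
    clean = [[0.0] * size for _ in range(size)]
    noisy = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            phi = terrain[y][x] + atmosphere[y][x]
            clean[y][x] = wrap(phi)
            # Assumed noise model: perturb the unit phasor, then take its angle.
            re = math.cos(phi) + rng.gauss(0.0, noise_sigma)
            im = math.sin(phi) + rng.gauss(0.0, noise_sigma)
            noisy[y][x] = math.atan2(im, re)
    return clean, noisy
```

Calling `simulate_pair` with different `seed` and `noise_sigma` values yields training pairs at different noise levels; a larger `noise_sigma` corresponds to lower coherence in this simplified model.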