Vehicle interior noise has become a key assessment criterion for automotive NVH (Noise, Vibration, and Harshness). In analyzing the NVH performance of the vehicle body, traditional SEA (Statistical Energy Analysis) simulation is usually limited by the accuracy of the material parameters obtained during acoustic package modeling and by restrictions on the conditions under which it applies. To address these shortcomings, a multi-level objective decomposition architecture for the interior noise at the driver's right ear is established, based on an analysis of the vehicle noise transmission paths. Combined with a data-driven approach, a ResNet neural network model is introduced. Its stacked residual blocks avoid the vanishing-gradient problem that arises as the depth of a traditional CNN increases, enabling a higher-precision prediction model. The method alleviates the inherent limitations of traditional SEA simulation design, and the prediction performance of the ResNet model is further enhanced by dynamically adjusting the learning rate. Finally, the proposed method is applied to a specific vehicle model and verified. The results show that the proposed method has significant advantages in prediction accuracy and robustness.
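
To make the two mechanisms named above concrete, the following is a minimal sketch in plain Python, not the model used in this work. It assumes a fully connected residual block computing y = ReLU(x + F(x)), and it assumes a reduce-on-plateau rule for the dynamic learning rate, since the adjustment strategy is not specified here. All names (`residual_block`, `PlateauLR`, `factor`, `patience`) are hypothetical.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """Dense layer: `weights` is a list of rows, one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def residual_block(v, w1, b1, w2, b2):
    """Compute y = ReLU(v + F(v)) with F = linear -> ReLU -> linear.

    The identity shortcut (the `v +` term) gives the gradient a direct
    path around F, which is why stacking many such blocks does not
    suffer from vanishing gradients the way a plain deep stack can.
    """
    hidden = relu(linear(w1, b1, v))
    f = linear(w2, b2, hidden)
    return relu([x + fx for x, fx in zip(v, f)])


class PlateauLR:
    """Assumed schedule: shrink the learning rate when the loss stalls."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor      # multiplicative decay applied on a plateau
        self.patience = patience  # non-improving epochs tolerated before decay
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        """Record one epoch's validation loss and return the updated rate."""
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr
```

With all weights set to zero, F(x) = 0 and the block reduces to the identity for non-negative inputs, which illustrates why adding residual blocks need not degrade a network that is already deep. In a training loop, `step` would be called once per epoch with the validation loss, and the returned rate passed to the optimizer.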