The accuracy of existing diesel engine fault diagnosis methods under strong noise, and their generalisation performance across different noise levels, remain limited. A novel multi‐scale CNN‐LSTM neural network (MSCNN‐LSTMNet) with a residual‐CNN denoising module is proposed for anti‐noise diesel engine misfire diagnosis. First, a residual‐CNN module is designed to denoise the original vibration signal measured from the diesel engine cylinder, and a residual loss is used to construct a new loss function. Considering the essential characteristics of the measured vibration signals at different scales, a multi‐scale convolutional neural network (CNN) block is designed to realise multi‐scale feature extraction. Specifically, multiple convolution layers arranged in parallel branches with different convolution kernel sizes are used to extract features at different time scales, enhancing the robustness of the model. On this basis, a long short‐term memory (LSTM) network is used to further extract sequential features, improving anti‐noise and generalisation performance. The effectiveness of MSCNN‐LSTMNet is validated by experimental results for both single‐ and hybrid‐cylinder diesel engine misfire diagnosis under various noise levels and working conditions. The results demonstrate that MSCNN‐LSTMNet achieves much better anti‐noise and generalisation performance than the existing methods. Under strong noise conditions (−10 dB signal‐to‐noise ratio) on four datasets, MSCNN‐LSTMNet obtained an average accuracy of 97.561%, while the average accuracies of random forest, deep neural network, CNN, and MSCNNNet were 73.828%, 79.544%, 82.247%, and 89.741%, respectively. Moreover, for 11 noise generalisation tasks between different noise levels, MSCNN‐LSTMNet obtained accuracies of at least 96.679%, 97.849%, 98.892%, and 94.010% on the four datasets, which are much higher than those of the existing methods.
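To make the strong-noise condition concrete, the following minimal sketch shows one common way of corrupting a signal to a prescribed signal‐to‐noise ratio. It rests on assumptions not stated above: the noise is taken to be additive white Gaussian noise, the sine wave is only a hypothetical stand-in for a measured vibration signal, and the function names `add_noise_at_snr` and `measured_snr_db` are illustrative rather than taken from the proposed method.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to the target SNR in dB.

    Assumption: additive white Gaussian noise; the source does not specify the noise model.
    """
    if rng is None:
        rng = random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for a vibration signal: a 50 Hz sine sampled at 10 kHz.
clean = [math.sin(2 * math.pi * 50 * t / 10_000) for t in range(20_000)]
noisy = add_noise_at_snr(clean, snr_db=-10.0, rng=random.Random(0))
print(f"measured SNR: {measured_snr_db(clean, noisy):.2f} dB")
```

At −10 dB the noise power is ten times the signal power, which indicates how heavily the fault-related content is masked in the strong-noise experiments.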