As CBM extraction becomes more mechanized and geological conditions continue to evolve, production data from CBM wells are becoming increasingly nonlinear, which makes accurate prediction of future gas production difficult. A single deep-learning model applied to CBM production prediction can suffer from overfitting, exploding gradients, and vanishing gradients, all of which can limit prediction accuracy. In this paper, we use a CNN to extract features from CBM well data and combine it with Bi-LSTM and a Multi-Head Attention mechanism to construct a production prediction model for CBM wells, the CNN-BL-MHA model. Using production data from Wells W1 and W2 as the model's database, we predict gas production for experimental wells. We compare the predictions of the CNN-BL-MHA model with those of single models such as ARIMA, LSTM, MLP, and GRU. The results show that the proposed CNN-BL-MHA model improves the accuracy of gas production prediction for CBM wells and is also highly stable, which is essential for reliable predictions. Compared with the single deep-learning models used in this study, its prediction accuracy improves by up to 35%, and its predictions match the actual production data with lower error.
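To make the reported comparison concrete, the following minimal sketch shows one way a relative improvement over a baseline model could be computed. It assumes RMSE as the error metric and defines improvement as the fractional reduction in error; the abstract does not state which metric or definition underlies the 35% figure, so both are assumptions. The function names `rmse` and `relative_improvement` are hypothetical, and the numbers are made-up toy values, not data from Wells W1 or W2.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and equal in length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def relative_improvement(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline.

    Assumed definition: (baseline - model) / baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - model_error) / baseline_error


# Made-up toy values for illustration only (not well data).
actual = [100.0, 110.0, 120.0, 130.0]
baseline_pred = [90.0, 120.0, 110.0, 140.0]  # each point off by 10
hybrid_pred = [94.0, 116.0, 114.0, 136.0]    # each point off by 6

baseline_rmse = rmse(actual, baseline_pred)
hybrid_rmse = rmse(actual, hybrid_pred)
improvement = relative_improvement(baseline_rmse, hybrid_rmse)

print(f"baseline RMSE = {baseline_rmse:.2f}")
print(f"hybrid RMSE   = {hybrid_rmse:.2f}")
print(f"improvement   = {improvement:.0%}")
```

Under this definition, an "up to 35%" improvement would mean that the largest error reduction against any single baseline model is 35%. Other metrics such as MAE or MAPE would slot into the same comparison.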