Gas pre-warning systems for driving faces commonly suffer from limited indicators, poor adaptability, and imprecise modeling. To address these problems, this study proposes a hybrid prediction and pre-warning model based on time-series analysis. It aims to reduce the loss of warning accuracy that arises when a model is applied across diverse mines and when data are insufficient. First, we introduce an adaptive normalization (AN) model for standardizing gas sequence data; it gives more weight to recent information so as to better capture the time-series characteristics of gas readings. Combined with a Gated Recurrent Unit (GRU) model, AN yields better forecasting performance than other standardization techniques. Next, Ensemble Empirical Mode Decomposition (EEMD) is used for feature extraction and to guide the selection of the Variational Mode Decomposition (VMD) order; the small decomposition errors obtained support this approach. The transformer framework is then enhanced to handle non-linearities, mitigate vanishing gradients, and analyze long time series effectively. To improve versatility across different mining scenarios, the Optuna framework is used for multiparameter optimization, with xgbRegressor employed for error assessment. Predictions are benchmarked against Recurrent Neural Networks (RNN), GRU, Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM); the hybrid model achieves an R-squared value of 0.980975 and a Mean Absolute Error (MAE) of 0.000149, the best performance among the compared models. To cope with data scarcity, bootstrapping is applied to estimate confidence intervals for the hybrid model. Dimensional analysis is used to construct real-time relative gas emission metrics, while persistent anomaly detection monitors sudden spikes in the time series, enabling unsupervised early warning of gas outbursts.
The model shows strong predictive performance and effective pre-warning capability, providing technical support for the development of intelligent coal mine operations.
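To illustrate the bootstrapping and persistent anomaly detection steps mentioned above, the following is a minimal sketch of one way they could be written. It is not the implementation used in this study. The function names `bootstrap_mae_ci` and `persistent_anomalies`, the fixed `threshold`, the `min_run` persistence length, and the synthetic data are all hypothetical choices made for illustration, and the bootstrap resamples prediction errors as if they were independent.

```python
import numpy as np


def bootstrap_mae_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean absolute error.

    Assumption: errors are resampled independently (no block structure).
    """
    rng = np.random.default_rng(seed)
    abs_err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    n = abs_err.size
    # Draw n_resamples index sets of size n, with replacement.
    idx = rng.integers(0, n, size=(n_resamples, n))
    # MAE of each resampled error set.
    maes = abs_err[idx].mean(axis=1)
    lower, upper = np.quantile(maes, [alpha / 2, 1 - alpha / 2])
    return float(abs_err.mean()), float(lower), float(upper)


def persistent_anomalies(values, threshold, min_run=3):
    """Return the start index of every run in which `values` stays above
    `threshold` for at least `min_run` consecutive samples.

    Assumption: "persistent" is read here as "exceeds a fixed threshold
    for several consecutive readings"; other definitions are possible.
    """
    starts = []
    run = 0
    for i, v in enumerate(values):
        if v > threshold:
            run += 1
            # Report each run once, at the moment it becomes long enough.
            if run == min_run:
                starts.append(i - min_run + 1)
        else:
            run = 0
    return starts


if __name__ == "__main__":
    # Synthetic data only; these are not gas readings from the study.
    rng = np.random.default_rng(42)
    y_true = rng.normal(loc=0.5, scale=0.05, size=300)
    y_pred = y_true + rng.normal(scale=0.01, size=300)
    mae, lower, upper = bootstrap_mae_ci(y_true, y_pred)
    print(f"MAE = {mae:.5f}, 95% CI = [{lower:.5f}, {upper:.5f}]")

    readings = [0.1, 0.9, 0.1, 0.9, 0.9, 0.9, 0.9, 0.1]
    print(persistent_anomalies(readings, threshold=0.5, min_run=3))
```

Requiring several consecutive exceedances keeps a single noisy reading from triggering an alert, at the cost of a delay of `min_run - 1` samples. For strongly autocorrelated gas series, a block bootstrap may be a more suitable resampling scheme than the independent resampling assumed here.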