To overcome the limitations of long-term PM2.5 concentration prediction, a multi-factor information flow causality analysis is used to select suitable meteorological and air-pollutant factors, which are concatenated with the PM2.5 sequence to form the dataset. A mode decomposition algorithm is integrated as a module into the Autoformer (a Transformer improved with an autocorrelation mechanism), yielding the proposed modal Autoformer (empirical mode decomposition combined with Autoformer). The model first decomposes the sequence into several components with the mode decomposition module, then uses the autocorrelation mechanism and the decomposition structure to extract features from each component at the time-feature level. Based on a matching method, the model is adjusted to the characteristics of the different components to improve long-term prediction. The model is applied to three cities in Henan Province (Zhengzhou, Luoyang, and Zhumadian), and a gated recurrent unit (GRU), Informer, Autoformer, and modal GRU (empirical mode decomposition combined with GRU) are built for comparison. The results show that the modal Autoformer copes better with the complex characteristics of long-term PM2.5 time series prediction and adapts well across locations: it achieves the best values on all indicators in all three cities, with R² above 0.96 in every city (highest 0.987, in Zhengzhou), MAPE (mean absolute percentage error) below 10 in every city (best 7.602, in Zhumadian), and MAE (mean absolute error) below 4 in every city. The prediction performance is stable, demonstrating the feasibility and adaptability of the model for long-term prediction.
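For readers unfamiliar with the three indicators reported above, the following is a minimal sketch of how R², MAE, and MAPE are conventionally computed. It assumes the standard textbook definitions and that MAPE is expressed in percent; the abstract does not state the exact formulas used. The function names and the sample values are hypothetical and are not data from the study.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (assumed convention).

    Undefined when any observed value is zero.
    """
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 36.0]

print(mae(observed, predicted))
print(mape(observed, predicted))
print(r2(observed, predicted))
```

Under these definitions, a lower MAE or MAPE and an R² closer to 1 indicate a better fit, which is the sense in which the reported values are compared across models.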