Air pollution is a worldwide problem. It not only directly affects the environment and human health, but also influences regional and even global climate by changing the atmospheric radiation budget, with extensive and serious adverse effects. Accurate prediction of pollutant concentrations is therefore of great importance. In this study, domain knowledge from the atmospheric sciences, advanced deep learning methods and big data are combined to establish a novel integrated model, TSTM, whose name derives from its fundamental features of Time, Space, Type and Meteorology, for regional and multistep air quality forecasting. First, the Expectation Maximization and Min-Max algorithms are used for data interpolation and normalization, respectively. Second, feature selection and construction are carried out based on domain knowledge and correlation coefficients, and the Sliding Time Window algorithm is then employed to build the supervised learning task. Third, the pollution source and meteorological condition features are learned and predicted by a CNN-BiLSTM-Attention model, which integrates a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network in a Sequence to Sequence framework with an Attention mechanism; a Convolutional Long Short-Term Memory neural network (ConvLSTM) then integrates these two determining features to obtain the predicted pollutant concentrations. A multiple-output strategy is employed for multistep prediction. Lastly, the forecast performance of TSTM for pollutant concentration, air quality and heavy pollution weather is tested systematically. Experiments are conducted in the Beijing-Tianjin-Hebei Air Pollution Transmission Channel (“2+26” cities) of China for multistep prediction of the hourly concentrations of six conventional air pollutants.
The results show that TSTM outperforms the benchmark models, especially in heavy pollution weather, and that it has good robustness and generalization ability.
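As an illustration of two of the preprocessing steps named in the abstract, the following minimal sketch shows Min-Max normalization and a sliding time window that builds a supervised learning task with a multiple-output target. It is not the authors' implementation: the function names `min_max_normalize` and `sliding_window`, the window lengths `n_in` and `n_out`, and the toy data are all assumptions made for this example.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    span = x.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return (x - lo) / span

def sliding_window(series, n_in, n_out):
    """Build (X, Y) pairs: n_in past steps as input, the next n_out steps as targets."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window")
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    Y = np.stack([series[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    return X, Y

# Toy hourly record: 10 time steps, 2 features (values are made up for illustration).
data = np.arange(20, dtype=float).reshape(10, 2)
scaled = min_max_normalize(data)

# Hypothetical window sizes: 4 past hours in, 3 future hours out.
X, Y = sliding_window(scaled, n_in=4, n_out=3)
print(X.shape, Y.shape)
```

Here each target `Y[i]` holds all `n_out` future steps at once, which is the sense in which a multiple-output strategy predicts several horizons in a single pass rather than feeding one-step predictions back into the model.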