Forecasting air pollutant concentration, which is influenced by air pollution accumulation, traffic flow, and industrial emissions, has attracted extensive attention for decades. In this paper, we propose a spatio-temporal attention convolutional long short-term memory neural network (Attention-CNN-LSTM) for air pollutant concentration forecasting. First, we analyze the Granger causality between different stations and establish a hyperparametric Gaussian vector weight function to determine spatial autocorrelation variables, which are used as part of the input features. Second, a convolutional neural network (CNN) is employed to extract the temporal dependence and spatial correlation of the input, while feature maps and channels are weighted by an attention mechanism to improve the effectiveness of the features. Finally, a deep long short-term memory (LSTM) based time series predictor is established to learn the long-term and short-term dependence of pollutant concentration. To reduce the effect of diverse complex factors on the LSTM, inherent features are extracted from historical air pollutant concentration data, and meteorological data and timestamp information are incorporated into the proposed model. Extensive experiments were performed using Attention-CNN-LSTM, autoregressive integrated moving average (ARIMA), support vector regression (SVR), a traditional LSTM, and a CNN. The results demonstrate the feasibility and practicability of Attention-CNN-LSTM for estimating CO and NO concentrations.
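
To illustrate how a Gaussian weight function can turn readings from neighboring stations into a single spatial autocorrelation variable, the following is a minimal sketch under stated assumptions. The abstract does not give the exact form of the weight function, so the distance-based kernel, the `bandwidth` parameter, the function names, and the sample distances and readings are all hypothetical and chosen only for illustration.

```python
import math


def gaussian_weights(distances, bandwidth):
    """Return normalized Gaussian kernel weights for a list of distances.

    Assumed form (not taken from the paper):
        w_j is proportional to exp(-d_j^2 / (2 * bandwidth^2)).
    """
    raw = [math.exp(-(d ** 2) / (2.0 * bandwidth ** 2)) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]


def spatial_feature(neighbor_values, distances, bandwidth):
    """Weighted average of neighbor readings, usable as one input feature."""
    weights = gaussian_weights(distances, bandwidth)
    return sum(w * v for w, v in zip(weights, neighbor_values))


# Hypothetical example: three neighboring stations around a target station.
distances_km = [2.0, 5.0, 12.0]   # made-up distances to the target station
co_readings = [0.8, 1.1, 0.6]     # made-up CO readings at the same time step

weights = gaussian_weights(distances_km, bandwidth=5.0)
feature = spatial_feature(co_readings, distances_km, bandwidth=5.0)
```

In this sketch, closer stations receive larger weights, and the bandwidth plays the role of a hyperparameter controlling how quickly influence decays with distance. In the proposed approach, the set of stations entering such a weighting would be selected by the Granger causality analysis rather than by distance alone.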