Accurate prediction of air quality changes is essential for air pollution control and for people's daily mobility. Because air quality changes are strongly correlated in both space and time, existing air quality prediction methods often suffer from low prediction accuracy when they extract spatio-temporal features insufficiently. In this paper, we propose a self-tuning spatio-temporal neural network (ST2NN) to improve air quality prediction. The ST2NN model consists of four modules. First, it builds a temporal feature extraction module based on a gated recurrent unit (GRU) and a spatial feature extraction module based on a graph convolutional neural network (GCN); the two modules are arranged in parallel so that the spatio-temporal features in the data can be extracted effectively. Second, it builds a feature fusion module based on a gating mechanism to weigh the contributions of the temporal and spatial features to the predicted values. Third, it builds a hyperparameter optimization module based on the Hyperband algorithm to tune the network hyperparameters automatically. This structure gives the ST2NN model strong spatio-temporal feature extraction and parameter adaptability. The ST2NN model was evaluated against existing models, including the convolutional long short-term memory neural network (ConvLSTM), GRU, a combined convolutional neural network and long short-term memory neural network (CNN-LSTM), and GCN-LSTM, for air quality index (AQI) prediction using data from twelve monitoring stations in Beijing, China. Across all four evaluation indexes, the ST2NN model outperformed the comparison models, improving prediction accuracy by 0.51%-10.18% (measured using $R^{2}$). The experimental results show that the ST2NN model, built from the perspective of spatio-temporal feature extraction, predicts better than the existing air quality prediction models, providing a new method for air quality prediction with practical application value.
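The gating-based feature fusion mentioned above can be illustrated with a minimal sketch. The abstract does not give the fusion equations, so the form used here (a sigmoid gate computed from the concatenated temporal and spatial features, followed by an element-wise convex combination) is an assumption, as are all names (`gated_fusion`, `h_t`, `h_s`, `W`, `b`) and the random toy inputs that stand in for the GRU and GCN branch outputs.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function, mapping a real number into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(h_temporal, h_spatial, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    gate_i  = sigmoid(weights[i] . [h_temporal; h_spatial] + bias[i])
    fused_i = gate_i * h_temporal[i] + (1 - gate_i) * h_spatial[i]

    This is an assumed, commonly used gating form, not the paper's
    confirmed equations.
    """
    if len(h_temporal) != len(h_spatial):
        raise ValueError("feature vectors must have the same length")
    concat = list(h_temporal) + list(h_spatial)
    fused, gates = [], []
    for i, (t, s) in enumerate(zip(h_temporal, h_spatial)):
        pre_activation = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        gate = sigmoid(pre_activation)
        gates.append(gate)
        # A gate near 1 favors the temporal feature, near 0 the spatial one.
        fused.append(gate * t + (1.0 - gate) * s)
    return fused, gates


# Hypothetical toy inputs standing in for the GRU and GCN branch outputs.
rng = random.Random(0)
dim = 4
h_t = [rng.uniform(-1, 1) for _ in range(dim)]
h_s = [rng.uniform(-1, 1) for _ in range(dim)]
W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(dim)]
b = [0.0] * dim

fused, gates = gated_fusion(h_t, h_s, W, b)
```

In this sketch each gate value lies strictly between 0 and 1, so every fused feature falls between its temporal and spatial inputs; in a trained network the gate weights would be learned jointly with the two branches rather than drawn at random.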