Air quality degradation is a long-standing and increasingly urgent global crisis that constrains development and sustainability, and it has been shown to pose severe threats to human health and social welfare. More accurate predictive models can therefore play a fundamental role in air quality assessment and in improving human well-being. In this paper, four machine learning models, namely random forest (RF), ridge regression, support vector machine, and extremely randomized trees (ExtraTrees), were used to predict PM2.5 concentrations in ten cities in the Jing-Jin-Ji region of North China from multi-source spatiotemporal data, including time series of air quality and meteorological observations. Data were fed into the models using a rolling prediction method, which improved prediction accuracy in our experiments. The comparative experiments show that, at the city level, the RF and ExtraTrees models give better predictions than the other selected models, with lower mean absolute error (MAE) and root mean square error (RMSE) and a higher index of agreement (IA). In terms of seasonality, all four models perform best in winter and worst in summer, and the RF models perform best, with IA ranging from 0.93 to 0.98 and MAE from 5.91 to 11.68 μg/m³. These differences in model performance across cities and seasons are expected to inform environmental policy.
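For readers unfamiliar with the evaluation setup, the three metrics and the rolling scheme can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the study: IA is assumed to follow Willmott's index of agreement, "rolling prediction" is read as fitting on a sliding window of past values and predicting the next step, the helper names (`rolling_predict`, `window_mean`) are hypothetical, the window mean is only a stand-in for a trained RF, ExtraTrees, SVM, or ridge model, and the PM2.5 values are made up.

```python
import math
from typing import Callable, List, Sequence


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def index_of_agreement(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Willmott's index of agreement (assumed definition); 1.0 is a perfect match."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    # A zero denominator means every value equals the observed mean.
    return 1.0 - num / den if den else 1.0


def rolling_predict(series: Sequence[float], window: int,
                    fit_predict: Callable[[Sequence[float]], float]) -> List[float]:
    """For each step t >= window, use series[t-window:t] to predict series[t]."""
    preds = []
    for t in range(window, len(series)):
        history = series[t - window:t]  # only past values, no look-ahead
        preds.append(fit_predict(history))
    return preds


def window_mean(history: Sequence[float]) -> float:
    """Hypothetical stand-in for fitting a model on `history` and forecasting."""
    return sum(history) / len(history)


# Made-up PM2.5 values in μg/m³, for illustration only.
pm25 = [35.0, 42.0, 50.0, 47.0, 39.0, 30.0, 28.0, 33.0]
window = 3
pred = rolling_predict(pm25, window, window_mean)
obs = pm25[window:]
print(f"MAE={mae(obs, pred):.2f}  RMSE={rmse(obs, pred):.2f}  "
      f"IA={index_of_agreement(obs, pred):.2f}")
```

To evaluate a real model, `window_mean` would be replaced by a function that fits the chosen regressor on the window (including meteorological features) and returns its next-step forecast. The metric functions stay unchanged.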