To improve the accuracy of machine-learning-based atmospheric visibility (V) prediction under different pollution scenarios, this paper establishes a new visibility prediction method based on a stacking fusion model (VSFM). The method uses a stacking strategy to fuse two base learners, eXtreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM). In addition, seasonal feature importance evaluation and feature selection are used to tune the prediction for each season, since pollution sources differ between seasons. The VSFM was applied to one year of environmental and meteorological data measured in Qingdao, China. Compared with traditional non-stacking models, the VSFM improved precision across seasons, especially in extremely low-visibility scenarios (V < 2 km). Its threat score (TS) was significantly better than that of the other models: for extremely low-visibility scenarios, the VSFM had a TS of 0.5, while the best of the other models stayed below 0.27. The method is promising for atmospheric visibility prediction under complex urban pollution conditions, and the results can also improve our understanding of the factors that influence urban visibility.
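The abstract does not spell out how the threat score is computed. The following is a minimal sketch that assumes the standard forecast-verification definition, TS = hits / (hits + misses + false alarms), applied to the low-visibility event V < 2 km. The function name `threat_score`, its parameters, and the sample values are hypothetical and are not taken from the paper.

```python
def threat_score(observed_km, predicted_km, threshold_km=2.0):
    """Threat score for the event 'visibility below threshold_km'.

    Assumed definition: hits / (hits + misses + false alarms).
    Correct rejections (neither observed nor predicted) are ignored.
    """
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed_km, predicted_km):
        obs_event = obs < threshold_km    # low visibility actually occurred
        pred_event = pred < threshold_km  # low visibility was predicted
        if obs_event and pred_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif pred_event:
            false_alarms += 1
    denom = hits + misses + false_alarms
    # With no observed or predicted events the score is undefined.
    return hits / denom if denom else float("nan")


# Made-up visibility values in km, for illustration only.
observed = [1.0, 1.5, 3.0, 5.0, 1.8, 10.0]
predicted = [1.2, 2.5, 1.9, 6.0, 1.0, 9.0]
ts = threat_score(observed, predicted)  # 2 hits, 1 miss, 1 false alarm
```

Because correct rejections do not enter the formula, the score is not inflated by the many hours with good visibility, which is presumably why it suits rare low-visibility events.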
Because visibility is influenced by many possible factors, keeping the data requirements of machine-learning visibility prediction lightweight reduces the cost and difficulty of observing and collecting data in practical applications. Using long-term measured data from Qingdao, this research comprehensively compares the performance of five common machine learning methods under different training parameter schemes: XGBoost, LightGBM, Random Forest (RF), Support Vector Machine (SVM), and Multiple Linear Regression (MLR). Lightweight visibility prediction schemes based on pollutant parameter optimization are established. The seasonal training data for the five models are preprocessed, and the prediction performance is then evaluated. The results show that, among the models, the ensemble learning models (XGBoost, LightGBM, and RF) predict seasonal visibility significantly better than SVM and MLR, and XGBoost and LightGBM are slightly better than RF. Among the pollutant parameters, solid pollutants have a greater impact on visibility prediction than gaseous pollutants, and PM2.5 is more influential than PM10. The preferred scheme in this research uses six parameters (the meteorological parameters plus PM2.5) with an XGBoost or LightGBM model. This scheme achieves the same prediction performance as the 11-parameter scheme. The Correlation Coefficient (CC) of the results is around 0.85. The results can serve as a machine learning scheme reference for practical visibility prediction applications and also help to deepen the understanding of the factors affecting visibility.
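The abstract reports a Correlation Coefficient without naming its type. As a rough illustration, the sketch below assumes it is the Pearson correlation between observed and predicted visibility. The function name `pearson_cc` and the sample values are hypothetical.

```python
import math


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of products of deviations from the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Made-up visibility values in km, for illustration only.
observed_km = [2.0, 5.0, 8.0, 12.0, 20.0]
predicted_km = [3.0, 4.5, 9.0, 11.0, 18.0]
cc = pearson_cc(observed_km, predicted_km)
```

Under this assumption, comparing the six-parameter and 11-parameter schemes amounts to computing the same statistic on each scheme's predictions against the same observations.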