The presence of pollutants in the air has a direct impact on our health and causes detrimental changes to our environment. Air quality monitoring is therefore of paramount importance. Because accurate air quality stations are expensive to acquire and maintain, only a small number of them can be deployed in a country. To improve the spatial resolution of air monitoring, one promising approach is to develop data-driven models that predict air quality from readily available data. In this paper, we investigate the correlations between air pollutant concentrations and meteorological and road traffic data. Using machine learning, we develop both linear and non-linear regression models to predict pollutant concentrations. It is shown that the non-linear models, namely Random Forest (RF) and Support Vector Regression (SVR), better describe the impact of traffic flows and meteorology on the concentrations of pollutants in the atmosphere. It is also shown that more accurate prediction models can be obtained when the concentrations of some pollutants are included as predictors. This may be used to infer the concentrations of some pollutants from those of other pollutants, thereby reducing the number of air pollution sensors required.
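The comparison between linear and non-linear regressors described above can be illustrated with a minimal sketch. It is not the paper's pipeline: the data are synthetic, the predictor names (`traffic`, `wind`, `temp`), the target `no2`, the assumed dilution-by-wind relationship, and all hyperparameters are hypothetical choices made only so that the example runs on its own with scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-ins for traffic and meteorological predictors
# (assumption: these are NOT the paper's measurements).
rng = np.random.default_rng(0)
n = 1000
traffic = rng.uniform(0, 1000, n)   # vehicles per hour
wind = rng.uniform(0.5, 10, n)      # wind speed in m/s
temp = rng.uniform(-5, 35, n)       # temperature in deg C

# Assumed non-linear relationship: emissions grow with traffic and are
# diluted by wind, plus a small temperature effect and Gaussian noise.
no2 = 0.05 * traffic / wind + 0.2 * temp + rng.normal(0, 2, n)

X = np.column_stack([traffic, wind, temp])
X_train, X_test, y_train, y_test = train_test_split(
    X, no2, test_size=0.25, random_state=0
)

# One linear baseline and the two non-linear model families named in the abstract.
models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # SVR is sensitive to feature scale, so the inputs are standardized first.
    "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)),
}

# Fit each model and record its R^2 on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

On this synthetic target the non-linear models should score higher than the linear baseline, because the traffic-to-wind ratio cannot be expressed as a weighted sum of the raw predictors. Adding the concentration of another pollutant as an extra column of `X` would be the analogous way to sketch the abstract's second finding.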