Application of imputation methods for missing values of PM₁₀ and O₃ data: Interpolation, moving average and K-nearest neighbor methods

Saeipourdizaj, Parisa; Sarbakhsh, Parvin; Gholampour, Akbar

doi:10.34172/ehem.2021.25

Cited by 27 publications

(8 citation statements)

References 50 publications

“…The bias of multivariate estimations like correlation or regression coefficients is one of the drawbacks of the linear moving average imputation method. In general, there is no link between the values imputed by the mean of the variables and the other variables 42 …”

Section: Methods · mentioning

confidence: 99%
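
The bias described in the statement above can be illustrated with a minimal sketch. It assumes plain mean imputation on a single variable; the names `temperature`, `pm`, `pearson` and `mean_impute` are hypothetical and the values are synthetic, not data from the cited study. Because every gap receives the same constant, the filled values carry no information about the other variable, so the estimated correlation shrinks.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]

# Synthetic, perfectly linear relationship (illustrative values only).
temperature = [float(t) for t in range(10)]
pm = [2.0 * t + 1.0 for t in temperature]

# Knock out the four most extreme observations, then fill them with the mean.
pm_with_gaps = list(pm)
for i in (0, 1, 8, 9):
    pm_with_gaps[i] = None
pm_filled = mean_impute(pm_with_gaps)

r_full = pearson(temperature, pm)
r_filled = pearson(temperature, pm_filled)
print(f"correlation, complete data:  {r_full:.3f}")
print(f"correlation, mean-imputed:   {r_filled:.3f}")
```

The drop is exaggerated here because the gaps sit at the extremes of the series; with gaps scattered at random the attenuation is usually smaller but points in the same direction.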

Prediction of daily fine particulate matter (PM2.5) concentration in Aksaray, Turkey: Temporal variation, meteorological dependence, and employing artificial neural network

Koçak

2024

Env Prog and Sustain Energy

This study analyzed the temporal variation and prediction of fine particulate matter (PM2.5) concentrations in Aksaray, Turkey, a city in Central Anatolia. The relationship between PM2.5 and meteorological parameters such as temperature, humidity, wind speed, and wind direction was investigated. An artificial neural network (ANN) model was developed to predict PM2.5 levels based on meteorological data and air pollutant information. Seasonal and diurnal patterns of PM2.5 concentrations were observed, with higher values recorded during the winter and lower values during the summer. Additionally, higher levels were observed in the morning and evening, while lower levels were recorded in the afternoon. The variations in meteorological parameters, especially temperature and wind speed, significantly influenced PM2.5 levels. To predict hourly PM2.5 concentrations, single and multiple data imputation techniques were employed in combination with resilient back‐propagation (RPROP‐ANN). The neural network was applied, consisting of one input layer comprising 11 parameters, one hidden layer with 20 neurons, and an output layer. The results indicate that the best forecasting performance for PM2.5 was demonstrated by the combination of the missForest imputation technique with the RPROP neural network, as assessed by the coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE). The proposed model is characterized by a low RMSE of 5.94 and a high R2 value of 0.88, demonstrating exceptional predictive performance in air quality.
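
The three scores named in this abstract (MAE, RMSE, R²) can be computed as in the sketch below. It assumes the common definition R² = 1 − SS_res / SS_tot; the abstract does not say which variant the authors used, and the function name and sample numbers are hypothetical.

```python
import math

def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R2) for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot  # assumed definition of R2
    return mae, rmse, r2

# Illustrative values only, not taken from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

RMSE penalizes large errors more heavily than MAE, which is why the two are often reported together.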

“…In contrast, machine learning and deep learning approaches often yield superior imputation results but typically necessitate extended imputation durations compared to statistical methods. Concurrently, traditional machine learning approaches, encompassing K-Nearest Neighbor, fuzzy methods, decision trees, support vectors, and other models, have been integrated into the repertoire of techniques for addressing missing values [29][30][31]. A case in point is the work of Honghai et al, where Support Vector Machine (SVM) regression was employed to estimate missing conditional attribute values, illustrating the efficacy of machine learning in enhancing data completeness, but not with large datasets [32].…”

Section: Introduction · mentioning

confidence: 99%

Research on Missing Value Imputation to Improve the Validity of Air Quality Data Evaluation on the Qinghai-Tibetan Plateau

Wang

Liu

et al.

2023

Atmosphere

In the Qinghai-Tibet Plateau region, operational deficiencies and limited maintenance capacities often impair automatic air quality monitoring stations. This results in frequent data omissions, compromising the reliability of environmental assessment data. Therefore, an effective data imputation method is required to address the gaps in observational records. Utilizing a Sequence-to-Sequence framework, we introduce a model termed Bidirectional Recurrent Imputation for Time Series-Attention-based Long Short-Term Memory (BRITS-ALSTM). The encoder of BRITS-ALSTM applies BRITS to integrate single-station historical characteristics with multi-station correlation features. Concurrently, the decoder employs LSTM within an attention mechanism to capitalize on previously observed data, thereby generating hourly imputations for missing air quality data values. The model was trained using six types of air quality data from 16 stations across Qinghai Province. Through localized testing and parameter optimization, BRITS-ALSTM achieved a reduction in mean relative error (MRE) by 74.88% compared to the baseline mean-filling approach. Additionally, ablation studies demonstrated an improvement in the coefficient of determination R-squared (R2) from 0.67 to 0.76, outperforming the standalone BRITS. Consequently, BRITS-ALSTM enhances the accuracy of air quality data evaluations in the Tibetan Plateau and offers an efficacious strategy for data imputation in elevated terrains.

“…Urban development and expansion have caused changes in both climatic and atmospheric conditions in recent years (5). These changes have affected the stability of the natural environment and the health of people especially those who live in urban areas (6).…”

Section: Introduction · mentioning

confidence: 99%

Trend analysis of Humidex as a heat discomfort index using Mann-Kendall and Sen’s slope estimator statistical tests

Ghalhari

Dehghan

Asghari

2022

Environ Health Eng Manag

Background: The aim of this research was to assess the Humidex (HD) trends as a thermal discomfort index by analyzing meteorological data during a 30-year period of summertime in Iran. Methods: For this purpose, data regarding average temperature and relative humidity were collected daily from 40 different synoptic meteorological stations during a 30-year statistical period (1985-2014). The HD index was calculated based on temperature and relative humidity according to an equation introduced by Masterton and Richardson. The Mann-Kendall and Sen’s slope tests were performed to analyze the changing trend of the HD. Results: Based on the findings, in 72% of the meteorological stations, the HD followed an upward trend, so that 40% of them was statistically significant. The changing trends in temperature during summertime throughout the studied years fluctuated greatly but generally, in many regions such as the arid, semi-arid, and humid regions, this trend was mostly incremental. Also, the changing trends in relative humidity in all regions was decremental throughout the years under study. Conclusion: The changing trend of the HD, which is based on temperature and humidity, was incremental in arid and semi-arid regions and decremental in the Mediterranean and humid regions.
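
The two tests named in this abstract can be sketched as follows. This is a minimal version under stated assumptions: no correction for tied values in the variance, equally spaced observations, and hypothetical function names and sample values. It is not the authors' implementation.

```python
import math
from statistics import median

def _sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def mann_kendall_s(series):
    """Mann-Kendall S: sum of signs of all later-minus-earlier differences."""
    n = len(series)
    return sum(_sign(series[j] - series[i])
               for i in range(n - 1) for j in range(i + 1, n))

def mann_kendall_z(series):
    """Standardized statistic Z, assuming no ties in the series."""
    n = len(series)
    s = mann_kendall_s(series)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance without tie correction
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0

def sens_slope(series):
    """Sen's slope: median of all pairwise slopes, in units per time step."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)

# Made-up yearly index values, for illustration only.
yearly_index = [30.1, 30.4, 30.2, 30.9, 31.3, 31.0, 31.8]
print("S =", mann_kendall_s(yearly_index))
print("Z =", round(mann_kendall_z(yearly_index), 3))
print("Sen's slope =", round(sens_slope(yearly_index), 3))
```

A positive Z with |Z| above the chosen normal quantile (1.96 at the 5% level) indicates a statistically significant upward trend, and Sen's slope gives its magnitude in a way that is robust to outliers.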
