2023
DOI: 10.17576/jkukm-2023-35(1)-18

Comparisons of Various Imputation Methods for Incomplete Water Quality Data: A Case Study of The Langat River, Malaysia

Abstract: In this study, the ability of numerous statistical and machine learning models to impute water quality data was investigated at three monitoring stations along the Langat River in Malaysia. Inconsistencies in the percentage of missing data between monitoring stations (varying from 20 percent (moderate) to over 50 percent (high)) represent the greatest obstacle of the study. The main objective was to select the best method for imputation and compare whether there are differences between the methods used by the …

Cited by 8 publications
(2 citation statements)
“…This method addresses potential data loss due to various errors, such as connection or power issues, ensuring the integrity and continuity of the dataset for the AE analysis. This decision is made on the basis that large gaps in the data, more than 25% missing values, would significantly degrade the quality and accuracy of the data analysis [31]. If the gap is greater than this threshold, it is omitted.…”
Section: AE-LSTM (citation type: mentioning)
confidence: 99%
“…Data pre-processing involves a series of data preparation process used to handle missing value. Columns in the dataset which are having missing values replaced with the mean of remaining values in the column [42].…”
Section: B Data Preparation (citation type: mentioning)
confidence: 99%