2023
DOI: 10.1007/s10489-023-04828-6

Miss-gradient boosting regression tree: a novel approach to imputing water treatment data

Cited by 9 publications (7 citation statements)
References 66 publications
“…For example, Wang et al [21] developed a forward variable selection approach based on K-nearest-neighbor mutual information to eliminate redundant variables from the wastewater quality dataset and used support vector regression (SVR) to produce the effluent BOD. Zhang et al [8] used an updated gradient boosting regression tree (GBRT) to impute the missing value of effluent BOD, accounting for the missing measurement of sewage indicators generated by anomalous sensors in the sewage treatment process. Wang et al [22] employed random forest (RF) enhanced by latent Dirichlet allocation to reduce 12-dimension auxiliary feature vectors to 3-dimension feature vectors.…”
Section: Machine Learning Methods | Citation type: mentioning
confidence: 99%
“…The boosting algorithm can integrate simple base learners to reduce bias, and the bagging algorithm can sample different subsets of training data to train individual base learners to reduce variance. This combination of boosting and bagging has the potential to significantly address the issue of accurately predicting the effluent BOD with small-sized datasets in WWTPs [8].…”
Section: Overall Architecture Of En-wbf | Citation type: mentioning
confidence: 99%
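The combination described in the statement above (boosting to reduce bias, bagging to reduce variance) can be illustrated with a minimal sketch. This is not the citing paper's architecture or the Miss-GBRT method; it is a generic illustration under stated assumptions: a single input feature, one-split regression stumps as base learners, squared-error loss, and hypothetical function names (`fit_stump`, `fit_boosted`, `fit_bagged_boosting`, `predict_bagged`) chosen only for this example.

```python
import random

def fit_stump(xs, residuals):
    """Fit a one-split regression stump on 1-D inputs by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All inputs identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]

def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting: each stump fits the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lm if x <= t else rm) for t, lm, rm in stumps)

def fit_bagged_boosting(xs, ys, n_models=10, seed=0):
    """Bagging: train each boosted model on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_boosted([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Average the predictions of the individually boosted models."""
    return sum(predict_boosted(m, x) for m in models) / len(models)
```

In this sketch the averaging step in `predict_bagged` is what smooths out differences between bootstrap resamples, while the residual-fitting loop in `fit_boosted` is what lets weak stumps approximate the target progressively.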
“…The data preprocessing stage is optimized by capitalizing on domain knowledge related to ships and utilizing isolation forests for anomaly detection. To address completeness anomalies, a new technique called Miss-GBRT was developed [12]. Miss-GBRT can impute missing values in wastewater quality data even when the training data are limited.…”
Section: Related Work | Citation type: mentioning
confidence: 99%