2023
DOI: 10.1590/1678-4324-2023220365

A Big Data Cleaning Method for Drinking-Water Streaming Data

Abstract: A HA_Cart_AdaBoost model is proposed to clean drinking-water-quality data. First, data that do not follow the normal distribution are regarded as outliers and eliminated. Next, the optimal control theory of nonlinear partial differential equations (PDEs) is introduced into the CART decision tree, and a CART tree of specified depth is used as the weak classifier of AdaBoost. The HA_Cart_AdaBoost model then compensates for the eliminated data, and it fits and predicts the m…
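The first step of the abstract, discarding readings that do not fit a normal distribution, can be illustrated with a minimal sketch. It assumes a k-sigma rule (k = 3 by default) as the normality criterion; the abstract does not state the exact test, so the threshold, the function names `flag_outliers` and `drop_outliers`, and the pH readings are all hypothetical. The later step, in which the boosted CART model fills in the eliminated values, is not reproduced here.

```python
from statistics import mean, stdev


def flag_outliers(values, k=3.0):
    """Return one boolean per reading: True if it lies more than
    k sample standard deviations from the mean.

    The k-sigma rule is an assumption standing in for the paper's
    unspecified normality criterion.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All readings identical: nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mu) > k * sigma for v in values]


def drop_outliers(values, k=3.0):
    """Return the readings with the flagged outliers removed."""
    flags = flag_outliers(values, k)
    return [v for v, is_bad in zip(values, flags) if not is_bad]


# Hypothetical pH stream with one sensor glitch (14.0).
readings = [7.1, 7.0, 7.2, 7.1, 6.9, 7.0, 7.1, 7.2,
            7.0, 7.1, 6.9, 7.0, 7.1, 7.2, 7.0, 14.0]
cleaned = drop_outliers(readings)
```

With a small window a single extreme value inflates the standard deviation, so a fixed 3-sigma cut can miss outliers in very short series; a robust alternative such as a median-based rule may be preferable in practice.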

Cited by 2 publications (1 citation statement)
References 22 publications
“…Thus, new approaches were suggested in the literature to address anomalies in the big data context, such as in [18], where the authors explore anomalies in the banking sector caused by big data technologies, specifically addressing credit card discrepancies and utilizing a toolkit for assessing incongruities in a Wireless Application Protocol (WAP) instrument. Likewise, in [19], a model is proposed to effectively cleanse drinking-water-quality for big data. Firstly, data that deviate from the normal distribution are identified as outliers and removed.…”
Section: Related Work (mentioning)
confidence: 99%