2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT) 2022
DOI: 10.1109/icssit53264.2022.9716355

Study on Missing Values and Outlier Detection in Concurrence with Data Quality Enhancement for Efficient Data Processing

Cited by 9 publications (4 citation statements)
References 31 publications
“…The initial stage of data preparation involves reviewing the dataset for missing values, which can compromise the reliability of the model (Vinisha & Helen, 2022). Outlier detection is then crucial, as these abnormal values can significantly impact statistical estimates and data analysis results (Kwak & Kim, 2017).…”
Section: Data Preprocessing
Citation type: mentioning
confidence: 99%
“…For example, RE data accuracy is highly dependent on weather conditions, which include numerous variables (e.g., dust particles accumulation on sensors) (Zell et al, 2015). Additionally, equipment failure that causes missing data, device misalignment, and sensor cleaning procedure or calibration can introduce data variation (Vinisha & Sujihelen, 2022). This highlighted a vital data issue with real-time energy generation forecasting.…”
Section: Big Data
Citation type: mentioning
confidence: 99%
“…The solar irradiance data are collected using different devices that measure incidents on a sensor surface. However, in a hostile climate such as Saudi Arabia, the devices' performance is affected by various elements, such as dust particles accumulation on sensors (Zell et al, 2015) and missing data due to equipment failure, sensor cleaning procedure, or calibration (Vinisha & Sujihelen, 2022). As a result, this can influence data quality severely due to the different anomaly sources.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…When the absolute value of the heading difference near the missing value is within 5°, it is determined as a straight line trajectory; otherwise, it is a curve trajectory (Gao et al, 2021). After calculating the number of missing values according to Table 1, routes with straight continuous missing values between 0 and 5% of the total route length and curved continuous missing values between 0 and 2% of the route were interpolated, and trajectories exceeding these criteria were rejected (Vinisha and Sujihelen, 2022). As the linear interpolation algorithm has high operating efficiency, the route distance designed in this study is relatively short, and the routes are relatively smooth, it is suitable to use linear interpolation for implementation (Huang et al, 2011).…”
Section: Production Of Datasets
Citation type: mentioning
confidence: 99%