2018 8th International Conference on Cloud Computing, Data Science & Engineering (Confluence) 2018
DOI: 10.1109/confluence.2018.8442950

Data Cleaning-A Thorough Analysis and Survey on Unstructured Data

Cited by 30 publications
(5 citation statements)
References 2 publications
“…The authors of [52] analyzed and visualized unstructured data on air pollution, demonstrating the prevalence of numerous dangerous gases in the atmosphere. The authors emphasized the need for data cleaning in dealing with the problems caused by dirty data, such as null values, incomplete data, missing values, and inconsistent sampling date formats.…”
Section: Cleaning Data; citation type: mentioning
confidence: 99%
“…Data standardization can pave the way for the use of data science and machine learning driven innovations in the battery development industry. The standardization of conventions and definitions for data are the first step to any data science or machine learning project, where "data cleaning" as it is referred to colloquially amongst data scientists, is known to comprise a substantial portion of the effort in any large-scale data science project (Chu et al, 2016; Kumar and Khosla, 2018; Petrova-Antonova and Tancheva, 2020; Wang and Wang, 2020; Ilyas and Rekatsinas, 2022).…”
Section: Data Science Machine Learning and Battery Data; citation type: mentioning
confidence: 99%
“…The promising technologies essential for the organization of the digital industry in businesses and the collection of technologies needed to ensure the transformation from the current state of the industry to Industry 4.0 and then to Industry 5.0. A formal overview of Industry 4.0 and Industry 5.0 is also given, allowing the problem to be presented as a mathematical problem [7].…”
Section: Literature Review; citation type: mentioning
confidence: 99%
“…Hasan et al developed natural language processing (NLP) based preprocessed data framework to evaluate sentiment, and integrated the model definition of Bag of Words (BoW) and Term Frequency-Inverse Text Frequency (TF-IDF) [7].…”
Section: Literature Review; citation type: mentioning
confidence: 99%