2022
DOI: 10.3390/bdcc6040153

An Advanced Big Data Quality Framework Based on Weighted Metrics

Abstract: While big data benefits are numerous, the use of big data requires, however, addressing new challenges related to data processing, data security, and especially degradation of data quality. Despite the increased importance of data quality for big data, data quality measurement is actually limited to few metrics. Indeed, while more than 50 data quality dimensions have been defined in the literature, the number of measured dimensions is limited to 11 dimensions. Therefore, this paper aims to extend the measured …

Cited by 16 publications (6 citation statements)
References 37 publications
“…DAFQA can handle all data formats and creates Python code to verify the six primary data quality dimensions. In [41] proposed a Big Data Quality Assessment Framework with 12 metrics, including four new ones: Integrity, Accessibility, Ease of Manipulation, and Security, and measured data quality with weights at three levels and five quality aspects for a macro view. In [107] proposed…”
Section: Related Work (mentioning)
confidence: 99%
“…This metric refers to data respecting the expected rules and constraints regarding data type [16]. Thus, a conform dataset means that all data values in a particular column or field have the same data type and format.…”
Section: Conformity (mentioning)
confidence: 99%
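The conformity definition quoted above (all values in a column sharing the same data type and format) can be illustrated with a small sketch. This is not the metric as defined in the cited paper or in the framework itself; the function name `conformity_ratio`, the regular-expression check, and the sample column are assumptions made purely for illustration.

```python
import re

def conformity_ratio(values, pattern):
    """Share of non-missing values whose string form fully matches `pattern`.

    Hypothetical helper: the cited works do not prescribe this exact formula.
    """
    present = [v for v in values if v is not None]
    if not present:
        # Assumption: a column with no values has no format violations.
        return 1.0
    matches = sum(1 for v in present if re.fullmatch(pattern, str(v)))
    return matches / len(present)

# Illustrative column: ISO dates with one value in a different format.
dates = ["2023-01-05", "2023-02-17", "17/03/2023", None, "2024-11-30"]
ratio = conformity_ratio(dates, r"\d{4}-\d{2}-\d{2}")
print(f"{ratio:.2f}")  # 3 of the 4 non-missing values match
```

Missing values are excluded here so that conformity is not conflated with completeness; a different framework could reasonably count them as violations instead.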
“…As misspelled words can make it difficult for users to search, filter, or analyze the data accurately. The ability to extract information from data is called data readability [16]. Thus, data readability patterns should highlight how much data is readable and represent meaningful insights.…”
Section: Readability (mentioning)
confidence: 99%
“…To validate the resulting dataset (sequences_df_prep_EN.csv) and facilitate the reuse, a data profiling analysis was conducted. First, it was important to guarantee the consistency, which could be defined as data presented in the same standard structure and its correctness in relationship with other data [16]. Therefore, it was validated that the elements within the interaction sequences correspond to the elements on the UI JSON file, and the users' IDs exist in the users.csv file.…”
Section: Technical Validation (mentioning)
confidence: 99%