Accurate data analysis requires high-quality data. However, inconsistencies occur frequently in real-world data and lead to untrustworthy decisions in downstream data analysis pipelines. In this research, we study the problem of inconsistency detection and repair for the OMD data model. We propose a framework for data quality assessment and repair for OMD. We formally define a weight-based repair-by-deletion semantics and present an automated weight generation mechanism that takes multiple input criteria into account. Our approach draws on multi-criteria decision making that considers the correlation, contrast, and conflict among criteria, which is often needed in data cleaning. After weight generation, we present a Min-Sum dynamic programming algorithm to find the minimum-weight solution. We then apply evolutionary optimisation techniques and demonstrate improved performance on medical datasets, making the approach practically feasible.