2018 IEEE International Conference on Big Data (Big Data), 2018
DOI: 10.1109/bigdata.2018.8622640

An Approach for Validating Quality of Datasets for Machine Learning

Cited by 17 publications
(10 citation statements)
References 18 publications
“…This is a truism in ML; including local data into a general model invariably improves local performance [25]. Less obvious is how much local data is needed [26], and what quality [diversity, accuracy] of the local data is needed [27]. Another question is how best to utilise local data: via incorporation into an existing model (transfer learning [28]), or by building a new model using both the general data (here the MARCO training set) and local data [29].…”
Section: PLOS ONE (mentioning)
confidence: 99%
“…Another technique is metamorphic testing, which deals with generic transformations of the relationship between inputs and outputs. This makes it possible to validate the model's diversity, fidelity, and veracity in relation to the dataset [28].…”
Section: Related Work (mentioning)
confidence: 99%
“…7 Max Planck Institute for Molecular Genetics, Berlin, Germany. 8 Alacris Theranostics GmbH, Berlin, Germany. 9 School of Medicine Biostatistics and Medical Informatics Dept., Acibadem University, Istanbul, Turkey.…”
Section: Funding (mentioning)
confidence: 99%
“…For example, the first AI-based device to receive market authorization from the FDA was assessed with a large prospective comparative clinical trial including 900 patients from multiple sites [4]. AI technologies must satisfy stringent regulations for approval as medical devices, because (1) the decision support provided is optimized and personalized continuously in real time, according to the phenotype of the patient [7]; (2) the performance of AI depends strongly on the training datasets used [8], resulting in a large risk of AI performing less well in real practice [9][10][11] or on another group of patients or institutions [9]. It is, therefore, essential to assess the performance and safety of AI before its introduction into routine clinical use.…”
Section: Introduction (mentioning)
confidence: 99%