2022
DOI: 10.3389/frai.2021.739432
Co-Inference of Data Mislabelings Reveals Improved Models in Genomics and Breast Cancer Diagnostics

Abstract: Mislabeling of cases as well as controls in case–control studies is a frequent source of strong bias in prognostic and diagnostic tests and algorithms. Common data processing methods available to researchers in the biomedical community do not allow for consistent and robust treatment of labeled data in situations where both the case and the control groups contain a non-negligible proportion of mislabeled data instances. This is an especially prominent issue in studies regarding late-onset conditions…
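The degradation described in the abstract can be illustrated with a small synthetic sketch (this is not the paper's method; the data, classifier, and mislabeling procedure below are illustrative assumptions): flipping a proportion p of case/control labels in the training set visibly lowers a standard classifier's test AUC.

```python
# Illustrative sketch (hypothetical synthetic data, scikit-learn assumed):
# effect of training-label mislabeling on a standard classifier's AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

def auc_with_mislabeling(p):
    """Flip a proportion p of training labels, return test-set AUC."""
    y_noisy = y_tr.copy()
    flip = rng.choice(len(y_noisy), size=int(p * len(y_noisy)), replace=False)
    y_noisy[flip] = 1 - y_noisy[flip]  # simulate mislabeled cases/controls
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)
    return roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

for p in (0.0, 0.2, 0.4):
    print(f"mislabeling p={p:.1f}: test AUC = {auc_with_mislabeling(p):.3f}")
```

The AUC at p = 0.4 falls clearly below the clean-label baseline, which is the kind of bias the abstract attributes to mislabeled cases and controls.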

Cited by 2 publications (2 citation statements) | References 44 publications
“…1 G and H, EOS with the model loss function from SPA (EOS+SPA, red dashed lines) allows a statistically significant improvement of prediction performance (measured with the common performance measure, area under curve [AUC]) for all of the tested mislabeling proportions p, for all of the considered biomedical examples. As was shown recently, co-inference of data mislabelings can significantly improve the predictive performance of supervised classifiers (17). Application of the EOS algorithm with the model loss function from SPA (EOS+SPA) allows achieving an AUC of 0.96 and an accuracy of (SI Appendix, Fig.…”
mentioning
confidence: 66%
“…The size of the training set, as well as the accuracy of the prior data used in training, also plays a central role in the denoising and segmentation of CT images, where the number of instances in the training set T is significantly smaller than the feature space dimension D, corresponding to the number of voxels. A problem characterized by T ≪ D pertains to the so-called “small-data learning challenge” [34,35,36,37,38], and represents a scenario in which ML and DL approaches are prone to quickly overfit the small training set (which in addition often also contains missing or incorrectly labeled data) and to achieve unsatisfactory performance on the validation set [39,40,41,42,43]. To tackle this issue, several alternative approaches have been proposed [44,45], with transfer learning representing one of the most powerful alternatives [46].…”
Section: Introduction
mentioning
confidence: 99%
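The T ≪ D overfitting regime described in the statement above can be sketched in a few lines (illustrative synthetic data; the sizes and model below are assumptions, not taken from the cited work): with far more features than training samples, a standard linear classifier fits even pure-noise labels almost perfectly on the training set while remaining near chance on held-out data.

```python
# Illustrative sketch of the "small-data" regime (T << D): a classifier
# trained on noise with D >> T overfits the training set and fails to
# generalize (hypothetical synthetic data, scikit-learn assumed).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
T, D = 20, 500                      # training instances << feature dimension
X_tr = rng.normal(size=(T, D))      # pure noise: no real signal to learn
y_tr = rng.integers(0, 2, size=T)   # random labels
X_te = rng.normal(size=(1000, D))
y_te = rng.integers(0, 2, size=1000)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("train accuracy:", accuracy_score(y_tr, clf.predict(X_tr)))
print("test accuracy: ", accuracy_score(y_te, clf.predict(X_te)))
```

The gap between near-perfect training accuracy and chance-level test accuracy is exactly the overfitting behaviour the quoted passage attributes to ML and DL approaches in the small-data setting.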