2021
DOI: 10.1038/s41598-021-01681-w

Effect of data leakage in brain MRI classification using 2D convolutional neural networks

Abstract: In recent years, 2D convolutional neural networks (CNNs) have been extensively used to diagnose neurological diseases from magnetic resonance imaging (MRI) data due to their potential to discern subtle and intricate patterns. Despite the high performances reported in numerous studies, developing CNN models with good generalization abilities is still a challenging task due to possible data leakage introduced during cross-validation (CV). In this study, we quantitatively assessed the effect of a data leakage cau…

Cited by 60 publications (46 citation statements)
References 65 publications
“…This calls into question the reliability of the assessments and hindering an objective comparison between research outcomes. This problem has also been demonstrated in 3D magnetic resonance (MR) imaging studies [13] and in digital pathology [25], where data leakage between the training and testing sets resulted in over-optimistic classification accuracy (>29% slide level classification accuracy in MR studies and up to 41% higher accuracy in digital pathology). Moreover, greater attention should be paid to the structure of datasets made available to the research community to avoid biasing the evaluation of different methods and undermining the usefulness of open-access datasets.…”
Section: Discussion (mentioning)
confidence: 92%
“…As for the case of many of the reviewed medical image analysis challenges [8], one aspect that is sometimes missing or not well described is how the testing dataset is generated from the original pool of data. Moreover, there are examples where the preparation of the testing dataset was described, but its overlap with the training set was not considered [9][10][11][12][13], undermining the reliability of the reported results. Focusing on deep-learning applications for OCT, depending on the acquisition set-up, volumes are usually acquired with micrometer resolution in the x, y and z directions in a restricted field of view, with tissue structures that are alike and affected by similar noise.…”
Section: Introduction (mentioning)
confidence: 99%
“…We repeated the 5-fold CV ten times to compensate for the sampling bias issue. It is essential to underline that, unlike other works (74, 75), we performed a patient-based splitting, and thus avoiding results inflated by the phenomenon of data leakage (80). We used the average value of the AUROC in the validation set to select the best ML/DL frameworks, and evaluated the generalizability on test sets allocated in the hold-out procedure.…”
Section: Discussion (mentioning)
confidence: 99%
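
The repeated, patient-based 5-fold CV described in the statement above can be sketched roughly as follows. This is a minimal standard-library illustration, not the citing authors' code: the function name `repeated_patient_kfold`, the `pt-…` identifiers, and the toy cohort size are all hypothetical, and model selection by AUROC and the hold-out test set are left out.

```python
import random


def repeated_patient_kfold(patient_ids, n_splits=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, val_idx) with patient-disjoint folds.

    patient_ids holds one entry per sample (e.g. per image or slice) naming
    the patient that sample came from. Hypothetical helper for illustration.
    """
    patients = sorted(set(patient_ids))
    if len(patients) < n_splits:
        raise ValueError("need at least n_splits distinct patients")
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        # Reshuffle the patients (not the samples) for every repetition.
        shuffled = patients[:]
        rng.shuffle(shuffled)
        # Deal patients round-robin into n_splits folds.
        fold_of = {p: i % n_splits for i, p in enumerate(shuffled)}
        for fold in range(n_splits):
            val_idx = [i for i, p in enumerate(patient_ids) if fold_of[p] == fold]
            train_idx = [i for i, p in enumerate(patient_ids) if fold_of[p] != fold]
            yield repeat, fold, train_idx, val_idx


# Toy cohort (assumed): 25 patients with 4 samples each.
patient_ids = [f"pt-{p:03d}" for p in range(25) for _ in range(4)]
splits = list(repeated_patient_kfold(patient_ids))
print(len(splits))  # number of (repeat, fold) pairs: n_repeats * n_splits
```

Because folds are assigned per patient rather than per sample, every sample from a given patient lands on the same side of each split, which is the property the statement credits with avoiding leakage-inflated results.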
“…However, these excellent results are due to data leakage, and studies have shown that the incorrect division of the training set and test set is one of the major causes of data leakage. [45] In literatures, [16][17][18] the authors extracted multiple 2D slice images from each 3D MRI image for classification experiments (among them, Billones et al [16] extracted twenty 2D slice images from each 3D MRI image, Sarraf et al [17] extracted all 2D slice images with nonzero average pixels from each 3D MRI image, and Naz et al [18] selected at least three 2D slice images from each 3D MRI image), and these slices were randomly divided into a training set and test set according to a certain ratio. These incorrect data division methods divide different slices of the same subject into a training set and a test set, resulting in data leakage.…”
Section: Classification Results For Data Augmentation (mentioning)
confidence: 99%
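
The difference between the slice-level division criticized in the statement above and a subject-level division can be illustrated with a small sketch. It uses only the Python standard library; the function names, the `sub-…` identifiers, and the toy dataset (10 subjects, 20 slices each) are assumptions made for illustration and do not come from any of the cited studies.

```python
import random


def slice_level_split(subject_ids, test_fraction=0.2, seed=0):
    """Leaky split: shuffle individual slices, ignoring their subject."""
    rng = random.Random(seed)
    indices = list(range(len(subject_ids)))
    rng.shuffle(indices)
    n_test = int(round(test_fraction * len(indices)))
    return indices[n_test:], indices[:n_test]


def subject_level_split(subject_ids, test_fraction=0.2, seed=0):
    """Leak-free split: assign whole subjects to either train or test."""
    rng = random.Random(seed)
    subjects = sorted(set(subject_ids))
    rng.shuffle(subjects)
    n_test = max(1, int(round(test_fraction * len(subjects))))
    test_subjects = set(subjects[:n_test])
    train = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train, test


def shared_subjects(subject_ids, train, test):
    """Subjects that contribute slices to both the training and test sets."""
    return {subject_ids[i] for i in train} & {subject_ids[i] for i in test}


# Toy data (assumed): one subject label per 2D slice.
subject_ids = [f"sub-{s:02d}" for s in range(10) for _ in range(20)]

leaky_train, leaky_test = slice_level_split(subject_ids)
clean_train, clean_test = subject_level_split(subject_ids)

print(len(shared_subjects(subject_ids, leaky_train, leaky_test)))  # > 0: leakage
print(len(shared_subjects(subject_ids, clean_train, clean_test)))  # 0 by construction
```

Checking `shared_subjects` before training is a cheap guard: any non-empty result means slices from one subject sit on both sides of the split.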