Integrating TANBN with cost sensitive classification algorithm for imbalanced data in medical diagnosis
2020. DOI: 10.1016/j.cie.2019.106266

Cited by 75 publications (39 citation statements). References 47 publications.
“…In future works, we attempt to explore data imbalance issue. The potential techniques include up/down sampling and cost‐ sensitive learning [22]. Besides, we will also investigate how to perform domain adaptation when source and target domains have different number of retinopathies, for instance the type of retinopathies in the target domain is a subset of that in the source domain.…”
Section: Conclusion and Discussion
Confidence: 99%
“…The best model without oversampling (ie, the LR model) also yielded remarkable findings (AUC 0.950; F1 0.604; sensitivity 0.764; specificity 0.919), and SMOTE oversampling further improved the model performance (AUC 0.960; F1 0.668; sensitivity 0.845; specificity 0.929). Considering the propensity of health care data to be imbalanced [51][52][53][54], our results suggest the need for further analysis of oversampling methods for medical data sets. Self-supervision [55,56] may also help improve the performance of models using imbalanced medical data sets; in particular, future studies should evaluate the impact of self-supervision on tabular medical data.…”
Section: Principal Findings
Confidence: 93%
“…We note that the best model without oversampling (LR) also yielded strong results (AUC: 0.950, F1: 0.604, sensitivity: 0.764, specificity: 0.919), and the SMOTE oversampling method improved the performance further (AUC: 0.960, F1: 0.668, sensitivity: 0.845, specificity: 0.929). Given the propensity of imbalanced data in healthcare, [44][45][46][47] our results suggest the need for further analysis of oversampling methods for medical datasets. Self-supervision, [48,49] may also help in improving performance on imbalanced medical datasets;…”
Section: Main Findings
Confidence: 93%