2023
DOI: 10.1038/s41598-023-45532-2

Augmentation strategies for an imbalanced learning problem on a novel COVID-19 severity dataset

Daniel Schaudt,
Reinhold von Schwerin,
Alexander Hafner
et al.

Abstract: Since the beginning of the COVID-19 pandemic, many different machine learning models have been developed to detect and verify COVID-19 pneumonia based on chest X-ray images. Although promising, binary models have only limited implications for medical treatment, whereas the prediction of disease severity suggests more suitable and specific treatment options. In this study, we publish severity scores for the 2358 COVID-19 positive images in the COVIDx8B dataset, creating one of the largest collections of publicl…

Cited by 2 publications (2 citation statements)
References 71 publications
“…We used Adaptive Synthetic Sampling (ADASYN) to oversample the minority class and address the class imbalance problem [16]. ADASYN mitigates this issue by adaptively generating synthetic samples for the minority class based on the local density distribution of existing instances [17]. The algorithm works mainly in four steps: (1) the data distribution analysis of all the classes, (2) the density estimation and identification of k-nearest neighbors of all instances in the minority classes, (3) the difficulty level measurement of minority and majority class instances, and (4) adaptive sampling based on the difficulty ratio to determine the number of synthetic samples needed for each minority class instance.…”
Section: Data Augmentation (mentioning)
confidence: 99%
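
The four steps named in the citation statement above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the citing authors' code: the function name `adasyn_sketch`, its parameters (`k`, `beta`, `seed`), the binary minority/majority setup, and the uniform fallback when no minority point has a majority neighbour are all assumptions made for the example. It follows the commonly described ADASYN recipe, in which a minority point's difficulty is the share of majority-class points among its k nearest neighbours.

```python
import numpy as np


def adasyn_sketch(X, y, minority_label, k=5, beta=1.0, seed=0):
    """Generate synthetic minority samples in the spirit of ADASYN (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    is_min = y == minority_label
    X_min = X[is_min]
    n_min, n_maj = len(X_min), int((~is_min).sum())

    # Step 1: class distribution -> total number of synthetic samples to create.
    n_total = int(round((n_maj - n_min) * beta))
    if n_total <= 0 or n_min < 2:
        return np.empty((0, X.shape[1]))

    # Step 2: k nearest neighbours of each minority point in the full dataset.
    # Column 0 is dropped as the point itself (assumes no exact duplicates).
    dist_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    nn_all = np.argsort(dist_all, axis=1)[:, 1:k + 1]

    # Step 3: difficulty = share of majority-class points among those neighbours.
    difficulty = (~is_min[nn_all]).sum(axis=1) / nn_all.shape[1]
    if difficulty.sum() == 0:
        # Assumption: fall back to uniform weights if no point is "hard".
        weights = np.full(n_min, 1.0 / n_min)
    else:
        weights = difficulty / difficulty.sum()

    # Step 4: harder points receive proportionally more synthetic samples.
    counts = np.round(weights * n_total).astype(int)

    # New samples are interpolated towards a random minority-class neighbour.
    dist_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    nn_min = np.argsort(dist_min, axis=1)[:, 1:k + 1]

    synthetic = []
    for i, n_new in enumerate(counts):
        for _ in range(n_new):
            j = rng.choice(nn_min[i])
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic).reshape(-1, X.shape[1])
```

Because each synthetic point is a convex combination of two minority points, the output always stays inside the minority class's bounding box. Rounding the per-point counts means the total can differ slightly from the requested number. In practice a maintained implementation, such as the ADASYN class in imbalanced-learn, would usually be preferred over a hand-rolled version.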
“…We employ an augmentation pipeline for all classification models to increase image variations and reduce overfitting during model training, which is common for many image domains [53][54][55]. This pipeline was inspired by the winning solution to the 2021 SIIM-FISABIO-RSNA Machine Learning COVID-19 Challenge [56] and is shown in Table 4.…”
Section: Classification Models (mentioning)
confidence: 99%