2022
DOI: 10.1145/3502287
A Systematic Review on Data Scarcity Problem in Deep Learning: Solution and Applications

Abstract: Recent advancements in deep learning architectures have increased their utility in real-life applications. Deep learning models require a large amount of data for training. In many application domains, such as marketing, computer vision, and medical science, only a limited set of data is available for training neural networks, because collecting new data is either infeasible or resource-intensive. Large training sets are also needed to avoid overfitting. One of the data space solu…
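The "data space" solutions the abstract alludes to are typically data augmentation: generating additional training samples by transforming existing ones. As a minimal, hedged sketch of the idea (the `augment` function, its parameters, and the array-based setup are illustrative, not taken from the paper):

```python
import numpy as np

def augment(x, rng, noise_scale=0.01):
    """Return simple data-space augmentations of one 2-D sample:
    the original, a horizontal flip, and a noise-jittered copy."""
    flipped = x[:, ::-1]                              # mirror along the width axis
    jittered = x + rng.normal(0.0, noise_scale, x.shape)
    return np.stack([x, flipped, jittered])

rng = np.random.default_rng(0)
sample = rng.random((8, 8))        # one 8x8 "image"
batch = augment(sample, rng)
print(batch.shape)                 # (3, 8, 8): one sample became three
```

Real pipelines use richer, label-preserving transforms (crops, rotations, color jitter), but the principle is the same: enlarge the effective dataset without collecting new data.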

Cited by 112 publications (39 citation statements)
References 85 publications
“…The number of patients involved can be limited even in multicentric studies; as in Orsini et al (2017) and Ferrari et al (2020a), where the few patients were unevenly enrolled across Modena, Hong Kong, and Sydney hospitals. Scarcity and sparsity are not always in contrast, and methods exist to limit both their effects (Bansal et al, 2021).…”
Section: Sparsity/Scarcity and (Im)balance
confidence: 99%
“…The deep learning models generally need a large amount of data in training to build sophisticated prediction functions, but they are not good for high-dimensional data with only a small number of samples. This is mainly because almost all deep learning models suffer from the data scarcity problem, i.e., deep learning may have poor performance for datasets with only a limited number of observations [22].…”
Section: Psychiatric Map (Pmap) Diagnosis: A Mislabeled-learning Algo…
confidence: 99%
“…On the other hand, the existing mislabeled-learning techniques may not apply to the high-dimensional SNP data in our context, because they are mainly developed for low-dimensional data in which a large number of samples are available. For example, adding noise filtering to deep neural networks (DNNs) may filter out samples with corrupted labels, but it cannot be applied to high-dimensional data because the filtering scheme can be too costly to implement for a dataset with only a limited number of samples [1,22]. Thus, it is necessary to seek an effective mislabeling-correction technique, which acts as the relabeling function 𝑓, to tackle the problem and retrieve the most likely ground truth for each sample.…”
Section: Psychiatric Map (Pmap) Diagnosis: A Mislabeled-learning Algo…
confidence: 99%
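The noise filtering the excerpt above deems impractical for small, high-dimensional datasets is often a "small-loss" filter: samples whose per-sample loss stays large are suspected of carrying corrupted labels. A hedged sketch of that selection step (the function name, the `keep_fraction` parameter, and the example losses are hypothetical, not from either paper):

```python
import numpy as np

def small_loss_filter(losses, keep_fraction=0.6):
    """Return indices of the samples with the smallest per-sample loss.
    High-loss samples are suspected of having corrupted labels."""
    k = max(1, int(len(losses) * keep_fraction))
    keep = np.argsort(losses)[:k]     # indices of the k lowest-loss samples
    return np.sort(keep)              # restore original dataset order

losses = np.array([0.1, 2.5, 0.2, 0.15, 3.0])
print(small_loss_filter(losses))      # [0 2 3]: the two high-loss samples dropped
```

With very few samples, discarding any fraction of the data is costly, which is the excerpt's point: such filters presume a sample budget that small, high-dimensional cohorts do not have.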