2022
DOI: 10.1016/j.procs.2022.11.349

A comparative study on the effect of data imbalance on software defect prediction

Cited by 11 publications (2 citation statements)
References 27 publications
“…They exhibit variations in code size and include various software metrics. However, it is important to note that the NASA corpus is known to contain noisy attributes [17], have high dimensionality [18], and have imbalanced class records [19]. For example, the NASA JM1 dataset comprises 7,782 records with 1,672 containing defects and 6,110 without defects, each consisting of 22 attributes.…”
Section: NASA Metrics Data Program
mentioning
confidence: 99%
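
The class imbalance described in the statement above can be quantified from the quoted JM1 counts alone. The following is a minimal sketch, not taken from the cited paper; the variable names are illustrative assumptions, and only the three counts come from the quote.

```python
# Counts quoted in the citation statement for the NASA JM1 dataset.
defective = 1_672      # records containing defects
non_defective = 6_110  # records without defects

# The two classes should add up to the quoted 7,782 records.
total = defective + non_defective

# Share of the minority (defective) class among all records.
defect_rate = defective / total

# Majority-to-minority ratio, a common way to express imbalance.
imbalance_ratio = non_defective / defective

print(f"total records:   {total}")
print(f"defect rate:     {defect_rate:.1%}")
print(f"imbalance ratio: {imbalance_ratio:.2f} : 1")
```

On these counts, roughly one record in five is defective, so a classifier that always predicts "no defect" would already be right on most records while detecting no defects at all.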
“…In another work, Yanbin et al [6] investigated the impact of combining different sampling techniques and ML classifiers on defect prediction performance. While it finds no single optimal combination, it identifies support vector machines and deep learning as the most consistently performing classifiers.…”
mentioning
confidence: 99%