2022
DOI: 10.1007/s10462-022-10371-6

Data quality issues in software fault prediction: a systematic literature review

Cited by 13 publications (5 citation statements)
References 200 publications
“…The class overlap problem [3] refers to the existence of some samples in an unbalanced dataset as valid samples in multiple categories, which makes the prediction boundary vague and affects the model prediction accuracy [4]. In the current SDP study, the class overlap problem is mainly related to noise cleaning.…”
Section: Related Work (mentioning)
confidence: 99%
“…A comprehensive systematic literature review [38] focusing on data quality challenges in software fault prediction is discussed. The paper delves into the critical aspects related to the quality of data used for training and evaluating software fault prediction models.…”
Section: In-depth Review Of Existing Machine Learning Models Used For... (mentioning)
confidence: 99%
“…They found that two imbalanced variants of the bagging classifier performed better than other techniques, even in cross-project settings [45]. Bhandari et al (2023) conducted a systematic literature review on data quality issues in software fault prediction datasets. They analyzed 145 primary studies and identified various data quality issues, such as class imbalance and dimensionality, suggesting the need for further research in this area [46].…”
Section: Cross-project Prediction For Software Fault Prediction (mentioning)
confidence: 99%
“…Bhandari et al (2023) conducted a systematic literature review on data quality issues in software fault prediction datasets. They analyzed 145 primary studies and identified various data quality issues, such as class imbalance and dimensionality, suggesting the need for further research in this area [46]. Rathi et al (2023) performed an extensive evaluation of different combinations of data sampling and feature selection techniques to improve software fault prediction models.…”
Section: Cross-project Prediction For Software Fault Prediction (mentioning)
confidence: 99%