Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis 2022
DOI: 10.1145/3533767.3534405

On the use of evaluation measures for defect prediction studies

Abstract: Software defect prediction research has adopted various evaluation measures to assess the performance of prediction models. In this paper, we further stress the importance of choosing appropriate measures in order to correctly assess the strengths and weaknesses of a given defect prediction model, especially given that most defect prediction tasks suffer from data imbalance. Investigating 111 previous studies published between 2010 and 2020, we found that over half either use only one evaluatio…


Cited by 18 publications (9 citation statements)
References 41 publications
“…With that in mind, the greater the MCC, the better the solution. We opted to use MCC to assess and compare the accuracy of models as this measure has been strongly recommended as an alternative to other previously popular measures, such as F-measure, which have been shown to be biased [51,63,73] when the data is imbalanced (as is frequently the case in DP). MCC is a more balanced measure which, unlike the other measures, takes into account all the values of the confusion matrix [51,63].…”
Section: Fitness Functions
confidence: 99%
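The MCC described in this statement can be computed directly from the four confusion-matrix cells. The sketch below is illustrative only (not taken from the cited paper); it uses the standard MCC formula, with a conventional value of 0 when the denominator vanishes:

```python
from math import sqrt

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from the four confusion-matrix cells.

    Unlike F-measure, MCC uses all four cells (TP, TN, FP, FN), which is
    why it is recommended for imbalanced defect-prediction data.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal is empty (undefined ratio).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Imbalanced example: 90 non-defective, 10 defective modules.
# A trivial model that predicts everything non-defective:
print(mcc(tp=0, tn=90, fp=0, fn=10))  # 0.0 — MCC exposes the trivial classifier
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why "the greater the MCC, the better the solution."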
“…We use MCC to evaluate the prediction performance of the models given that we do not target a specific business context [45,51], and, as explained in Section 3.2, MCC is a comprehensive measure, which provides a full picture of the confusion matrix by assessing all its aspects equally. It is also not sensitive to highly imbalanced data and is widely used in the defect prediction and machine learning literature [51,63,73].…”
Section: Evaluation Criteria
confidence: 99%
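The F-measure bias this statement alludes to is easy to demonstrate: because F-measure ignores true negatives, a degenerate classifier can score highly on imbalanced data while MCC correctly reports zero. A small sketch with hypothetical counts (not data from the paper):

```python
from math import sqrt

def f1(tp: int, fp: int, fn: int) -> float:
    # F-measure: harmonic mean of precision and recall; TN never appears.
    return 2 * tp / (2 * tp + fp + fn)

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    # MCC uses all four confusion-matrix cells.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical imbalanced set: 95 defective, 5 clean modules.
# A model that labels *everything* defective:
tp, tn, fp, fn = 95, 0, 5, 0
print(round(f1(tp, fp, fn), 3))  # 0.974 — F-measure looks excellent
print(mcc(tp, tn, fp, fn))       # 0.0   — MCC shows no discriminative power
```

This contrast is one reason the cited studies [51,63,73] argue against F-measure as the sole evaluation measure under class imbalance.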