2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE) 2021
DOI: 10.1109/icse43902.2021.00050

Early Life Cycle Software Defect Prediction. Why? How?

Abstract: Many researchers assume that, for software analytics, "more data is better." We write to show that, at least for learning defect predictors, this may not be true. To demonstrate this, we analyzed hundreds of popular GitHub projects. These projects ran for 84 months and contained 3,728 commits (median values). Across these projects, most of the defects occur very early in their life cycle. Hence, defect predictors learned from the first 150 commits and four months perform just as well as anything else. This mean…

Cited by 10 publications (1 citation statement)
References: 76 publications

“…Works in vulnerability analysis [10], quality assessment [22], testing [8], and code maintenance [1] have been completed. Early defect prediction offers substantial benefits, as it reduces long-term costs [23]. Predicting potential production issues in code is especially advantageous in an industry setting.…”
Section: Introduction (citation type: mentioning)
confidence: 99%