Early Life Cycle Software Defect Prediction. Why? How?

Shrikanth, N. C.; Majumder, Suvodeep

doi:10.1109/icse43902.2021.00050

Cited by 10 publications

(1 citation statement)

References 76 publications

“…Works in vulnerability analysis [10], quality assessment [22], testing [8], and code maintenance [1] have been completed. Early defect prediction offers substantial benefits, as it reduces long-term costs [23]. Predicting potential production issues in code is especially advantageous in an industry setting.…”

Section: Introduction (mentioning)

confidence: 99%

Code Revert Prediction with Graph Neural Networks: A Case Study at J.P. Morgan Chase

Pei, Alamir, Dolga et al. 2023

Proceedings of the 1st International Workshop on Software Defect Datasets

Code revert prediction, a specialized form of software defect detection, aims to forecast or predict the likelihood of code changes being reverted or rolled back in software development. This task is very important in practice because by identifying code changes that are more prone to being reverted, developers and project managers can proactively take measures to prevent issues, improve code quality, and optimize development processes. However, compared to code defect detection, code revert prediction has been rarely studied in previous research. Additionally, many previous methods for code defect detection relied on independent features but ignored relationships between code scripts. Moreover, new challenges are introduced due to constraints in an industry setting such as company regulation, limited features and large-scale codebase. To overcome these limitations, this paper presents a systematic empirical study for code revert prediction that integrates the code import graph with code features. Different strategies to address anomalies and data imbalance have been implemented including graph neural networks with imbalance classification and anomaly detection. We conduct the experiments on real-world code commit data within J.P. Morgan Chase which is extremely imbalanced in order to make a comprehensive comparison of these different approaches for the code revert prediction problem.

CCS Concepts: Software and its engineering → Software maintenance tools; Computing methodologies → Neural networks.

A research landscape on software defect prediction

Taskeen, Khan, Felix 2023

J Software Evolu Process

Software defect prediction is the process of identifying defective files and modules that need rigorous testing. In the literature, several secondary studies including systematic reviews, mapping studies, and review studies have been reported. However, no research work such as a tertiary study that combines secondary studies has focused on providing a landscape of software defect prediction useful to understand the body of knowledge. Motivated by this, we intend to perform a tertiary study by following a systematic literature review protocol to provide a research landscape of the targeted domain. We synthesize the quality of the secondary studies and investigate the employed techniques and the performance evaluation measures for evaluating the software defect prediction model. Furthermore, this study aims at exploring different datasets employed in the reported experimentation. Moreover, the current study intends at highlighting the research trends, gaps, and opportunities in the targeted research domain. The results indicate that none of the reported defect prediction techniques can be regarded as the best; however, the reported techniques performed better in different testing situations. In addition, machine learning (ML)‐based techniques perform better than traditional statistical techniques mainly due to the potential of discovering the defects and generating generalized results. Moreover, the obtained results highlight the need for further work in the domain of ML‐based techniques. Furthermore, publicly available datasets should be considered for experimentation or replication purposes. The potential future work can focus on data quality, ethical ML, cross‐project defect prediction, early defect prediction process, class imbalance problem, and model overfitting.
