George G. Cabral scite author profile

Just-in-Time Software Defect Prediction (JIT-SDP) is an SDP approach that makes defect predictions at the software change level. Most existing JIT-SDP work assumes that the characteristics of the problem remain the same over time. However, JIT-SDP may suffer from class imbalance evolution. Specifically, the imbalance status of the problem (i.e., how underrepresented the defect-inducing changes are) may be intensified or reduced over time. If this occurs, it could render existing JIT-SDP approaches unsuitable, including those that rebuild classifiers over time using only recent data. This work thus provides the first investigation of whether class imbalance evolution poses a threat to JIT-SDP. The investigation is performed in a realistic scenario by taking into account verification latency, the often overlooked fact that labeled training examples arrive with a delay. Based on 10 GitHub projects, we show that JIT-SDP suffers from class imbalance evolution, which significantly hinders the predictive performance of existing JIT-SDP approaches. Compared to state-of-the-art class imbalance evolution learning approaches, the predictive performance of JIT-SDP approaches was up to 97.2% lower in terms of g-mean. Hence, it is essential to tackle class imbalance evolution in JIT-SDP. We then propose a novel class imbalance evolution approach for the specific context of JIT-SDP. While maintaining top-ranked g-means, this approach managed to produce up to 63.59% more balanced recalls on the defect-inducing and clean classes than state-of-the-art class imbalance evolution approaches. We thus recommend it to avoid overemphasizing one class over the other in JIT-SDP.
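The abstract reports results in terms of g-mean and of how balanced the recalls on the two classes are. As a minimal sketch, assuming the usual definition of g-mean as the geometric mean of the per-class recalls, the Python below computes both quantities for a batch of predictions. The function names, the label encoding (1 = defect-inducing, 0 = clean), and the use of the absolute recall difference as a balance measure are illustrative assumptions, not taken from the paper.

```python
import math

def recalls(y_true, y_pred):
    """Return (recall on class 1, recall on class 0).

    Assumed encoding: 1 = defect-inducing change, 0 = clean change.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    recall1 = tp / (tp + fn) if (tp + fn) else 0.0
    recall0 = tn / (tn + fp) if (tn + fp) else 0.0
    return recall1, recall0

def g_mean(y_true, y_pred):
    """Geometric mean of the two per-class recalls."""
    recall1, recall0 = recalls(y_true, y_pred)
    return math.sqrt(recall1 * recall0)

def recall_gap(y_true, y_pred):
    """Absolute difference between the recalls (hypothetical balance measure)."""
    recall1, recall0 = recalls(y_true, y_pred)
    return abs(recall1 - recall0)

# Two defect-inducing changes among ten: an imbalanced batch.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_clean = [0] * 10                      # ignores the minority class
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]       # finds one of the two defects

print(g_mean(y_true, always_clean), recall_gap(y_true, always_clean))
print(g_mean(y_true, mixed), recall_gap(y_true, mixed))
```

A classifier that always predicts "clean" is right on 8 of the 10 changes here, yet its g-mean is 0 because its recall on the defect-inducing class is 0. This is why g-mean is a common choice when classes are imbalanced.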


An investigation of cross-project learning in online just-in-time software defect prediction

Tabassum, Minku, Feng et al. 2020


Combining nearest neighbor data description and structural risk minimization for one-class classification

Cabral, Oliveira, Cahu. 2008. Neural Comput & Applic


One-class classification is an important problem with applications in several different areas such as novelty detection, anomaly detection, outlier detection and machine monitoring. In this paper, we propose two novel methods for one-class classification, referred to as NNDDSRM and kNNDDSRM. The methods are based on the principle of structural risk minimization and the nearest neighbor data description (NNDD) one-class classifier. Experiments carried out using both artificial and real-world datasets show that the proposed methods are able to significantly reduce the number of stored prototypes in comparison to NNDD. The experimental results also show that the proposed methods outperformed NNDD, in terms of the area under the receiver operating characteristic (ROC) curve, on four of the five datasets considered in the experiments and had a similar performance on the remaining one.
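For readers unfamiliar with the baseline, the following is a minimal sketch of the NNDD decision rule as it is commonly described: a test point is accepted as belonging to the target class when its distance to its nearest training point is no larger than that training point's distance to its own nearest neighbor. It illustrates the base classifier only, not the NNDDSRM or kNNDDSRM methods proposed in the paper. The function names and the `threshold` parameter are assumptions made for illustration.

```python
import math

def nearest(point, candidates):
    """Return (distance, candidate) for the candidate closest to point."""
    return min((math.dist(point, c), c) for c in candidates)

def nndd_accepts(z, train, threshold=1.0):
    """Accept z if d(z, NN(z)) / d(NN(z), NN(NN(z))) <= threshold.

    `train` holds only target-class points (as tuples) and needs at
    least two of them. NNDD stores every training point as a prototype.
    """
    d_z, nn = nearest(z, train)
    # The neighbor of NN(z) is searched among the other training points.
    others = [x for x in train if x is not nn]
    d_nn, _ = nearest(nn, others)
    if d_nn == 0:
        # Duplicate training points: accept only an exact match.
        return d_z == 0
    return d_z / d_nn <= threshold

# Target class: the four corners of the unit square.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(nndd_accepts((0.5, 0.5), train))  # close to the training data
print(nndd_accepts((5.0, 5.0), train))  # far from the training data
```

Because this rule keeps the whole training set, its memory and query cost grow with the data, which gives context for the abstract's emphasis on reducing the number of stored prototypes.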


A Novel Method for One-Class Classification Based on the Nearest Neighbor Data Description and Structural Risk Minimization

Cabral, Oliveira, Cahu. 2007


Preprocessing unbalanced data using weighted support vector machines for prediction of heart disease in children

Tavares, Oliveira, Cabral et al. 2013




Class Imbalance Evolution and Verification Latency in Just-in-Time Software Defect Prediction
