Trend Analysis and Issue Prediction in Large-Scale Open Source Systems

Kenmei, B.; Antoniol, Giuliano; Di Penta, Massimiliano

doi:10.1109/csmr.2008.4493302

Cited by 37 publications

(40 citation statements)

References 20 publications


“…In all figures we also observe that the further we move the training period to the past the more likely the prediction quality would drop down to almost random (≈ 0.5). This provides some evidence to the statement the further back you go in time the more the prediction deteriorates (Kenmei et al 2008). More formally, from April 2002 to July 2003 the model exhibits a stable good prediction quality.…”

Section: Finding Periods Of Stability and Change (mentioning)

confidence: 91%

Time variance and defect prediction in software projects

et al. 2011


It is crucial for a software manager to know whether or not one can rely on a bug prediction model. A wrong prediction of the number or the location of future bugs can lead to problems in the achievement of a project's goals. In this paper we first verify the existence of variability in a bug prediction model's accuracy over time both visually and statistically. Furthermore, we explore the reasons for such a high variability over time, which includes periods of stability and variability of prediction quality, and formulate a decision procedure for evaluating prediction models before applying them. To exemplify our findings we use data from four open source projects and empirically identify various project features that influence the defect prediction quality. Specifically, we observed that a change in the number of authors editing a file and the number of defects fixed by them influence the prediction quality. Finally, we introduce an approach to estimate the accuracy of prediction models that helps a project manager decide when to rely on a prediction model. Our findings suggest that one should be aware of the periods of stability and variability of prediction quality and should use approaches such as ours to assess their models' accuracy in advance.



“…For Spring they also found that the single best predictor was the number of times a file had been changed, followed by the number of authors of these changes. Kenmei et al [10] collected bi-weekly snapshots over five years for three systems, one of which was Eclipse. From every snapshot they extracted the number of lines of code and identified the number of new change requests, i.e.…”

Section: Related Work (mentioning)

confidence: 99%

Characterizing the roles of classes and their fault-proneness through change metrics

Steff, Russo 2012

Proceedings of the ACM-IEEE International Symposium on Empirical Software Engineering and Measurement


Many approaches to determine the fault-proneness of code artifacts rely on historical data of and about these artifacts. These data include the code and how it was changed over time, and information about the changes from version control systems. Each of these can be considered at different levels of granularity. The level of granularity can substantially influence the estimated fault-proneness of a code artifact. Typically, the level of detail oscillates between releases and commits on the one hand, and single lines of code and whole files on the other hand. Not every information may be readily available or feasible to collect at every level, though, nor does more detail necessarily improve the results. Our approach is based on time series of changes in method-level dependencies and churn on a commit-to-commit basis for two systems, Spring and Eclipse. We identify sets of classes with distinct properties of the time series of their change histories. We differentiate between classes based on temporal patterns of change. Based on this differentiation, we show that our measure of structural change in concert with its complement, churn, effectively indicates fault-proneness in classes. We also use windows on time series to select sets of commits and show that changes over short amounts of time do effectively indicate the fault-proneness of classes.


“…Kenmei et al also use ARIMA models for predicting change requests, but build a different model for each of the four open-source systems (including Eclipse) they study [6]. They adopted a sampling policy of aggregating data every two weeks, so that they would have longer time series than if using monthly data.…”

Section: Related Work (mentioning)

confidence: 99%
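The two-week aggregation mentioned in the statement above can be illustrated with a minimal Python sketch. It assumes the raw data is a list of change-request dates and that each two-week window is summarized by a simple count; the function name `biweekly_counts`, the sample dates, and the start date are hypothetical and are not taken from Kenmei et al.

```python
from collections import Counter
from datetime import date


def biweekly_counts(event_dates, start, period_days=14):
    """Count events in consecutive windows of `period_days` days from `start`.

    Hypothetical helper for illustration only. Windows with no events
    are kept as zeros so the resulting series stays evenly spaced.
    """
    counts = Counter()
    for d in event_dates:
        if d < start:
            continue  # ignore events before the observation period
        counts[(d - start).days // period_days] += 1
    if not counts:
        return []
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


# Made-up change-request dates, not data from any of the studied systems.
start = date(2004, 1, 1)
requests = [
    date(2004, 1, 2),
    date(2004, 1, 10),
    date(2004, 1, 15),
    date(2004, 2, 20),
]
series = biweekly_counts(requests, start)
print(series)  # [2, 1, 0, 1]
```

With 14-day windows the same observation period yields roughly twice as many points as monthly aggregation, which is the motivation the statement gives for the sampling policy.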

“…The Ljung-Box test confirms that the random walk model is significantly less robust than our prediction model and, therefore, not appropriate. [6], is an ARIMA(5,0,5)(0,0,0). This is clearly the strongest alternative to the model proposed in this paper among those used in this comparison.…”

Section: Hypothesis H3 (mentioning)

confidence: 99%