Abstract—Empirical studies of software prediction models do not converge on an answer to the question "which prediction model is best?" The reason for this lack of convergence is poorly understood. In this simulation study, we examine a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross-validation. Typically, such empirical studies compare a machine learning model with a regression model; in our study, we make the same comparison, but on simulated data. The results suggest that the research procedure itself is unreliable, and that this unreliability may contribute strongly to the lack of convergence. Our findings therefore cast doubt on the conclusions of any study of competing software prediction models that used this research procedure as the basis for model comparison. More reliable research procedures must be developed before we can have confidence in the conclusions of comparative studies of software prediction models.
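As a hedged illustration of the procedure under study (a minimal sketch, not the authors' actual simulation design), the code below draws project samples from a hypothetical log-linear effort process, then applies k-fold cross-validation with the MMRE accuracy indicator to compare a regression model with a machine learning model. All model choices, function names, and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

def simulate_sample(n=30):
    """Draw one project data set from an assumed 'true' process
    (hypothetical log-linear effort model with multiplicative noise)."""
    size = rng.uniform(1, 100, n)                      # e.g., function points
    effort = 5 * size ** 0.9 * rng.lognormal(0, 0.4, n)
    return size.reshape(-1, 1), effort

def mmre(actual, predicted):
    """Accuracy indicator: mean magnitude of relative error."""
    return np.mean(np.abs(actual - predicted) / actual)

def compare_once(X, y, k=5):
    """The common procedure: cross-validate two models on ONE sample."""
    scores = {"regression": [], "ml": []}
    for train, test in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
        for name, model in (("regression", LinearRegression()),
                            ("ml", DecisionTreeRegressor(max_depth=3))):
            model.fit(X[train], y[train])
            scores[name].append(mmre(y[test], model.predict(X[test])))
    return {name: np.mean(s) for name, s in scores.items()}

# Repeating the procedure on fresh samples from the SAME true process
# can flip which model 'wins' -- the reliability problem in miniature.
wins = {"regression": 0, "ml": 0}
for _ in range(20):
    X, y = simulate_sample()
    result = compare_once(X, y)
    wins[min(result, key=result.get)] += 1
print(wins)
```

If the tally of wins is split across repetitions even though every sample comes from the same underlying process, a single-sample comparison clearly cannot be trusted to identify the better model.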
Abstract—Missing data are often encountered in the data sets used to construct software effort prediction models. Thus far, the common practice has been to ignore observations with missing data, which may result in biased prediction models. In this paper, we evaluate four missing data techniques (MDTs) in the context of software cost modeling: listwise deletion (LD), mean imputation (MI), similar response pattern imputation (SRPI), and full information maximum likelihood (FIML). We apply the MDTs to an ERP data set and then construct regression-based prediction models on the resulting data sets. The evaluation suggests that only FIML is appropriate when the data are not missing completely at random (MCAR); unlike FIML, prediction models constructed on LD, MI, and SRPI data sets will be biased unless the data are MCAR. Furthermore, compared with LD, MI and SRPI seem appropriate only if the resulting LD data set would be too small to enable the construction of a meaningful regression-based prediction model.
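To make the two simplest techniques concrete, here is a minimal sketch of LD and MI on an invented data set with missing predictor values (column names and figures are hypothetical, not taken from the paper's ERP data); SRPI and FIML are omitted because they require specialized estimation machinery rather than a few lines of code.

```python
import numpy as np
import pandas as pd

# Hypothetical project data with missing predictor values.
df = pd.DataFrame({
    "effort": [120.0, 340.0, 85.0, 410.0, 230.0],
    "users":  [50.0, np.nan, 20.0, 150.0, np.nan],
    "bpc":    [3.0, 7.0, np.nan, 9.0, 5.0],   # business process changes
})

# Listwise deletion (LD): drop every observation with any missing value.
# Simple, but shrinks the sample and biases results unless data are MCAR.
ld = df.dropna()

# Mean imputation (MI): replace each missing value with its column mean.
# Preserves sample size but understates variance and can bias estimates.
mi = df.fillna(df.mean())

print(len(ld), len(mi))  # 2 observations survive LD; MI keeps all 5
```

The shrinkage under LD (here from five observations to two) is exactly the situation in which the abstract suggests MI or SRPI may be preferable, provided the data are MCAR.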
Learning from high-performance projects is crucial for software process improvement. Therefore
Index Terms—Software process improvement, benchmarking, best practice identification, software project management, multivariate productivity measurements, data envelopment analysis (DEA), software development, enterprise resource planning (ERP), software metrics, economies of scale, variable returns to scale.