In multiple regression analysis, the most frequently occurring problems are multicollinearity and outliers. Both have undesirable effects on the least squares estimates of the regression parameters. The Jackknifed Ridge Regression estimator and the M-estimator have been proposed to overcome multicollinearity and outliers, respectively. The Jackknifed Ridge Regression estimator is obtained by shrinking the Ordinary Least Squares estimator. Since the Ordinary Least Squares estimator is sensitive to outliers, the Jackknifed Ridge Regression estimator is also sensitive to outliers. To overcome the combined problem of multicollinearity and outliers, we propose a new estimator, namely the Jackknifed Ridge M-estimator. This estimator is obtained by shrinking an M-estimator instead of the Ordinary Least Squares estimator.
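The shrinkage idea described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the abstract: the shrinkage matrix takes the form I - k^2 (X'X + kI)^(-2), as commonly given for the jackknifed ridge estimator with a scalar ridge parameter k; the M-estimator is a Huber estimator fitted by iteratively reweighted least squares; and the model has no intercept. The function names, the tuning constant c = 1.345, the value k = 0.5, and the simulated data are hypothetical choices made only for illustration.

```python
import numpy as np


def huber_m_estimate(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of regression coefficients via IRLS (illustrative)."""
    # Start from the ordinary least squares solution.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale: normalized median absolute deviation.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, c/|u| for large ones.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        Xw = X * w[:, None]
        # Weighted least squares step: solve (X'WX) beta = X'Wy.
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


def jackknifed_ridge_shrink(X, beta_hat, k):
    """Apply the assumed shrinkage [I - k^2 (X'X + kI)^(-2)] to beta_hat."""
    p = X.shape[1]
    A_inv = np.linalg.inv(X.T @ X + k * np.eye(p))
    return (np.eye(p) - k**2 * (A_inv @ A_inv)) @ beta_hat


# Simulated data (assumption): two nearly collinear predictors, a few outliers.
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 1.0]) + 0.1 * rng.normal(size=n)
y[:5] += 20.0  # contaminate the first five responses

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_jrr = jackknifed_ridge_shrink(X, beta_ols, k=0.5)  # shrink OLS
beta_m = huber_m_estimate(X, y)
beta_jrm = jackknifed_ridge_shrink(X, beta_m, k=0.5)    # shrink M-estimate

print("OLS:", beta_ols)
print("Jackknifed ridge (from OLS):", beta_jrr)
print("Huber M:", beta_m)
print("Jackknifed ridge M (from Huber M):", beta_jrm)
```

The only difference between the two shrunken estimates in this sketch is the starting estimator passed to the shrinkage step, which mirrors the construction described in the abstract. In practice the predictors would usually be standardized and k chosen by a data-driven rule; both are omitted here for brevity.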
Background: Generalized linear models (GLMs) are widely used to model social, medical, and ecological data. Choosing predictors for building a good GLM is a widely studied problem. Likelihood-based procedures such as the Akaike Information Criterion and the Bayes Information Criterion are usually used for model selection in GLMs. The non-robustness of likelihood-based procedures in the presence of outliers or of deviations from the assumed distribution of the response is well documented in the literature.
Results: The deviance-based criterion (DBC) is modified to define a robust and consistent model selection criterion called the robust deviance-based criterion (RDBC). A bootstrap version of RDBC is also proposed. A simulation study is performed to compare the proposed model selection criteria with the existing one. It indicates that the performance of the proposed criteria is comparable to that of the existing one. A key advantage of the proposed criterion is that it is very simple to compute.
Conclusions: The proposed model selection criterion is applied to the arboreal marsupials data, and model selection is carried out. The proposed criterion can be applied to data from any discipline, mitigating the effect of outliers or of deviations from the assumed distribution of the response. It can be implemented in any statistical software. In this article, R software is used for the computations.