Problem statement: The bootstrap approach has introduced new advances in modeling and model evaluation. It is a computer-intensive method that replaces theoretical formulation with extensive computation. The Ordinary Least Squares (OLS) method is often used to estimate the parameters of the regression models in the bootstrap procedure. Unfortunately, many statistics practitioners are not aware that the OLS method can be adversely affected by outliers. As an alternative, a robust method was put forward to overcome this problem. Outliers in the original sample may also distort the classical bootstrap estimates. Because bootstrap resampling is done with replacement, a bootstrap sample may contain more outliers than the original dataset. Consequently, the outliers can have an undue effect on the classical bootstrap mean and standard deviation. Approach: In this study, we propose a robust bootstrapping method that is less sensitive to outliers. In the robust bootstrapping procedure, we replace the classical bootstrap mean and standard deviation with robust location and robust scale estimates. A number of numerical examples were carried out to assess the performance of the proposed method. Results: The results suggest that the robust bootstrap method is more efficient than the classical bootstrap. Conclusion/Recommendations: In the presence of outliers in the dataset, we recommend using the robust bootstrap procedure, as its estimates are more reliable.
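The idea of summarizing bootstrap replicates with robust location and scale estimates can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not name the robust estimators, so the median and the scaled median absolute deviation (MAD) are assumed here, and the helper names (`ols_slope`, `mad_scale`, `bootstrap_slopes`, `summarize`) and the toy data are hypothetical.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Ordinary least squares slope for a simple linear regression."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def mad_scale(values):
    """Median absolute deviation, scaled by 1.4826 (normal consistency)."""
    values = list(values)
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])


def bootstrap_slopes(xs, ys, n_boot=500, seed=0):
    """Resample (x, y) pairs with replacement and refit the OLS slope."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples with a single x value
        slopes.append(ols_slope(bx, by))
    return slopes


def summarize(slopes):
    """Classical (mean, SD) and assumed robust (median, MAD) summaries."""
    return {
        "mean": statistics.fmean(slopes),
        "sd": statistics.stdev(slopes),
        "median": statistics.median(slopes),
        "mad": mad_scale(slopes),
    }


# Toy data (assumed): y = 2x + 1 plus noise, with one vertical outlier.
_rng = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + _rng.gauss(0.0, 1.0) for x in xs]
ys[19] = -40.0  # contaminate the last observation

slopes = bootstrap_slopes(xs, ys)
print(summarize(slopes))
```

Because the outlier can be drawn several times or not at all, the replicate slopes tend to be spread unevenly, which is the situation where the two pairs of summaries can disagree. The sketch still fits each resample by OLS; a fuller robust bootstrap would likely also use a robust regression fit.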
Forward selection (FS) is a very effective variable selection procedure for selecting a parsimonious subset of covariates from a large number of candidate covariates. Detecting the type of outlying observations, such as vertical outliers or leverage points, and the FS procedure are inseparable problems. For robust variable selection, a crucial issue is whether the outliers are univariate, bivariate, or multivariate. This paper uses a consistent robust multivariate dispersion estimator to obtain robust correlation estimators, which are used to establish robust forward selection (RFS) procedures that outperform methods based on robust bivariate correlations. The usefulness of the proposed procedure is studied with a numerical example and a simulation study. The results show that the proposed method is scalable, can deal with univariate, bivariate, and multivariate outlying observations, including leverage points and vertical outliers, and outperforms previously published RFS methods.
High Leverage Points (HLPs) are outlying observations in the X-direction. It is imperative to detect HLPs because the computed values of various estimates are affected by their presence. It is now evident that the Diagnostic Robust Generalized Potential based on the Minimum Volume Ellipsoid (DRGP(MVE)) is capable of detecting multiple HLPs. However, it has a very long computational running time. Another diagnostic measure, based on Index Set Equality and denoted DRGP(ISE), was put forward with the main aim of reducing this running time. Nonetheless, it is computationally unstable and still suffers from masking and swamping effects. Hence, in this paper, we propose another version of the diagnostic measure, based on Reweighted Fast Consistent and High Breakdown (RFCH) estimators. We call this measure Diagnostic Robust Generalized Potential based on √n RFCH and it is denoted by DRGP(RFCH). The results of a simulation study and real data indicate that our proposed method outperforms DRGP(MVE) and DRGP(ISE), with the least computing time, the highest percentage of correct detection of HLPs, and the smallest percentages of swamping and masking.
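To make the notion of a high leverage point concrete, the sketch below computes classical hat values for a simple regression with one predictor and flags points above the common 2p/n rule of thumb. It is only an illustration of leverage in the X-direction, not an implementation of DRGP(MVE), DRGP(ISE), or DRGP(RFCH); the function names and the toy data are hypothetical.

```python
import statistics


def hat_values(xs):
    """Diagonal of the hat matrix for y = a + b*x (intercept plus one slope)."""
    n = len(xs)
    mx = statistics.fmean(xs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return [1.0 / n + (x - mx) ** 2 / sxx for x in xs]


def flag_high_leverage(xs, p=2):
    """Indices whose hat value exceeds the 2p/n rule-of-thumb cutoff."""
    n = len(xs)
    cutoff = 2.0 * p / n
    return [i for i, h in enumerate(hat_values(xs)) if h > cutoff]


# Toy predictor values (assumed): ten regular points and one far-out point.
x = [float(i) for i in range(1, 11)] + [50.0]
print(flag_high_leverage(x))
```

Classical hat values of this kind are known to suffer from masking when several high leverage points occur together, which is the motivation for robust diagnostics such as those described above.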
The main purpose of this paper is to formulate a robust correlation coefficient for high dimensional data in the presence of multivariate outliers. The proposed method is compared with the existing robust bivariate correlation based on Adjusted Winsorization data and the well-known Pearson's correlation coefficient. The performance of the proposed method is investigated using artificial data and a simulation study. An important implication of these findings is that the robust correlation based on the RFCH estimator is more reliable and more efficient than the existing methods in all types of contamination scenarios.