Christophe Ley scite author profile

Researchers often lack knowledge about how to deal with outliers when analyzing their data. Even more frequently, researchers do not pre-specify how they plan to manage outliers. In this paper we aim to improve research practices by outlining what you need to know about outliers. We start by providing a functional definition of outliers. We then lay down an appropriate nomenclature/classification of outliers. This nomenclature is used to understand what kinds of outliers can be encountered and serves as a guideline to make appropriate decisions regarding the conservation, deletion, or recoding of outliers. These decisions might impact the validity of statistical inferences as well as the reproducibility of our experiments. To be able to make informed decisions about outliers you first need proper detection tools. We remind readers why the most common outlier detection methods are problematic and recommend the use of the median absolute deviation to detect univariate outliers, and of the Mahalanobis-MCD distance to detect multivariate outliers. An R package was created that can be used to easily perform these detection tests. Finally, we promote the use of pre-registration to avoid flexibility in data analysis when handling outliers.

show abstract

Detecting multivariate outliers: Use a robust variant of the Mahalanobis distance

Leys

Klein

Dominicy

et al. 2018

Journal of Experimental Social Psychology

340

161

View full text Add to dashboard Cite

Stein’s method for comparison of univariate distributions

Ley¹,

Reinert²,

Swan³

2017

Probab. Surveys

115

125

View full text Add to dashboard Cite

We propose a new general version of Stein's method for univariate distributions. In particular we propose a canonical definition of the Stein operator of a probability distribution which is based on a linear difference or differential-type operator. The resulting Stein identity highlights the unifying theme behind the literature on Stein's method (both for continuous and discrete distributions). Viewing the Stein operator as an operator acting on pairs of functions, we provide an extensive toolkit for distributional comparisons. Several abstract approximation theorems are provided. Our approach is illustrated for comparison of several pairs of distributions : normal vs normal, sums of independent Rademacher vs normal, normal vs Student, and maximum of random variables vs exponential, Fréchet and Gumbel.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Christophe Ley

Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median

How to Classify, Detect, and Manage Univariate and Multivariate Outliers, With Emphasis on Pre-Registration

Detecting multivariate outliers: Use a robust variant of the Mahalanobis distance

Stein’s method for comparison of univariate distributions

Contact Info

Product

Resources

About