2019
DOI: 10.1080/02664763.2019.1671961

Evaluation of robust outlier detection methods for zero-inflated complex data

Abstract: Outlier detection can be seen as a pre-processing step for locating data points in a data sample, which do not conform to the majority of observations. Various techniques and methods for outlier detection can be found in the literature dealing with different types of data. However, many data sets are inflated by true zeros and, in addition, some components/variables might be of compositional nature. Important examples of such data sets are the Structural Earnings Survey, the Structural Business Statistics, the…

Cited by 33 publications (14 citation statements)
References 49 publications
“…Cognitive scores were screened and individuals with outliers in any domain were excluded based on Interquartile Range (IQR): criteria for exclusion were values below Q1 - 1.5 IQR or above Q3 + 1.5 IQR, box-and-whisker plots were used 78, 79 (also known as Tukey’s method 80, 81). After removing individuals with outliers the final sample consisted of 70 SZH (18 females, aged between 19 and 65, mean age 37.56, see Table 1) and 72 age and gender-matched healthy controls (18 females, aged between 18 and 65, mean age 38.44, see Table 1).…”
Section: Methods (mentioning)
confidence: 99%
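The exclusion rule quoted above (values below Q1 - 1.5 IQR or above Q3 + 1.5 IQR) can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function names `tukey_fences` and `iqr_outliers` are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since the statement does not say how the quartiles were computed. Other conventions can shift the fences slightly.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences Q1 - k*IQR and Q3 + k*IQR.

    Assumption: quartiles use the "inclusive" interpolation method;
    the quoted statement does not specify a convention.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Return the values that fall strictly outside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical scores, made up for illustration only.
scores = [10, 12, 12, 13, 12, 11, 14, 13, 15, 100]
print(iqr_outliers(scores))
```

In the quoted study the rule is applied per cognitive domain, and an individual is excluded if any domain yields an outlier; the sketch covers a single variable only.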
“…The distributions for BMI and ED behaviors were positively skewed. As height and weight values were self‐reported, we used the Median Absolute Deviation (MAD) method (Leys, Ley, Klein, Bernard, & Licata, 2013), with a conservative cutoff of ±3 times the MAD to detect outliers (Templ, Gussenbauer, & Filzmoser, 2019). Based on the MAD cutoffs, 2.7 and 1.8% of weight and height values were considered implausible and considered missing.…”
Section: Methods (mentioning)
confidence: 99%
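The MAD rule quoted above (flag values more than 3 MADs from the median) can be sketched as follows. This is a hedged illustration, not the citing authors' implementation: the function name `mad_outliers` is hypothetical, and the consistency constant 1.4826 (which makes the MAD comparable to a standard deviation under normality) is an assumption, because the statement does not say whether the MAD was scaled. Pass `scale=1.0` for the raw MAD.

```python
from statistics import median


def mad_outliers(values, cutoff=3.0, scale=1.4826):
    """Return values farther than cutoff * MAD from the median.

    Assumption: the MAD is multiplied by `scale` (1.4826 under a
    normality assumption); the quoted statement does not specify this.
    """
    med = median(values)
    mad = scale * median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values equal the median, so the MAD is zero
        # and the rule is undefined; this sketch flags nothing.
        return []
    return [v for v in values if abs(v - med) > cutoff * mad]


# Hypothetical self-reported weights in kg, made up for illustration only.
weights = [60, 62, 61, 63, 59, 60, 61, 250]
print(mad_outliers(weights))
```

The zero-MAD branch is a design choice of this sketch: when more than half of the observations share one value, as can happen in data with many zeros, the MAD collapses to zero and some other scale estimate is needed.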
“…Moreover, to overcome the problem of outliers and influential observations, we recalibrate sample weights following the approach proposed by Alfons et al. (2013) and generally adopted by those working with income variables (Alfons and Templ, 2013; Brzesinki, 2016; Jenkins, 2017; Safari et al., 2018, 2019; Templ et al., 2019). This procedure consists of detecting outlier observations against a fitted Pareto distribution of the variable of interest, applying Van Kerm’s rule of thumb to determine the threshold (Van Kerm, 2007).…”
Section: Estimation Strategy (mentioning)
confidence: 99%