2018 International Joint Conference on Neural Networks (IJCNN) 2018
DOI: 10.1109/ijcnn.2018.8489605

XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning

Abstract: A new semi-supervised ensemble algorithm called XGBOD (Extreme Gradient Boosting Outlier Detection) is proposed, described and demonstrated for the enhanced detection of outliers from normal observations in various practical datasets. The proposed framework combines the strengths of both supervised and unsupervised machine learning methods by creating a hybrid approach that exploits each of their individual performance capabilities in outlier detection. XGBOD uses multiple unsupervised outlier mining algorithm…

Citations: cited by 92 publications (65 citation statements)
References: 24 publications

“…In this work, for multivariate data, we have compared the methodologies proposed with some multivariate outlier detection techniques. In the future, systematic experiments comparing with other well known methodologies such as XBGOD [29], LODES [30], iForest [31] or MASS [32] are to be carried out. Regarding these multivariate techniques, another interesting research line is the extension of such methodologies to functional data analysis.…”
Section: Discussion (mentioning)
confidence: 99%
“…While it is possible to experimentally determine an optimal k with crossvalidation [16] when ground truth is available, a similar trivial approach does not exist in an unsupervised setting. For these reasons, we recommend setting k = 0.1n, 10% of the training samples, bounded in the range of [30,100], which yielded good results in practice.…”
Section: Local Region Definition the Local Region (mentioning)
confidence: 99%
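
The rule of thumb quoted above (k = 0.1n, bounded in [30, 100]) can be sketched as a small helper. This is only an illustration of the arithmetic: the function name `local_region_size` and the use of integer division to round 0.1n down are assumptions, since the statement does not say how fractional values are handled or what to do when fewer than 30 training samples are available.

```python
def local_region_size(n_train: int, lower: int = 30, upper: int = 100) -> int:
    """Return k = 0.1 * n_train, clamped to the range [lower, upper].

    Hypothetical helper: rounding down via integer division is an
    assumption, not something the quoted statement specifies.
    """
    k = n_train // 10  # 10% of the training samples, rounded down
    # Clamp to the recommended bounds. Note that for n_train < lower the
    # result exceeds the number of samples; the quoted text does not cover
    # that case, so it is left to the caller.
    return max(lower, min(upper, k))


if __name__ == "__main__":
    for n in (100, 500, 5000):
        print(n, local_region_size(n))
```

With these assumptions, 100 training samples give k = 30 (lower bound), 500 give k = 50, and 5000 give k = 100 (upper bound).
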
“…Classification can be performed at the individual frame level, where each frame is treated as an independent sample, or at the song level where the goal is to classify the artist corresponding to a particular song using multiple samples. The latter can be interpreted as a form of ensembling where aggregating frame level predictions and voting up to the song level can yield variance reduction; this has been an effective approach in various interdisciplinary machine learning studies [26]- [28].…”
Section: Frame Level Versus Song Level Evaluation (mentioning)
confidence: 99%
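
The song-level aggregation described in the statement above might look like the following minimal sketch, assuming a plain majority vote over frame-level predictions. The function name `song_level_predictions`, the input layout (parallel lists of song identifiers and predicted labels), and the tie-breaking behaviour are all assumptions for illustration; the citing paper may aggregate differently.

```python
from collections import Counter, defaultdict


def song_level_predictions(song_ids, frame_predictions):
    """Aggregate frame-level labels into one label per song by majority vote.

    Hypothetical helper. Ties are resolved in favour of the label that was
    seen first for that song, which is an arbitrary choice.
    """
    votes = defaultdict(Counter)
    for song, label in zip(song_ids, frame_predictions):
        votes[song][label] += 1  # one vote per frame
    # most_common(1) returns the label with the highest vote count
    return {song: counts.most_common(1)[0][0] for song, counts in votes.items()}


if __name__ == "__main__":
    songs = ["s1", "s1", "s1", "s2", "s2"]
    frames = ["artist_a", "artist_b", "artist_a", "artist_c", "artist_c"]
    print(song_level_predictions(songs, frames))
```

In this toy input, one of the three frames of `s1` is misclassified, yet the song-level vote still returns `artist_a`, which is the variance-reduction effect the statement refers to.
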