Proceedings of the 2009 SIAM International Conference on Data Mining 2009
DOI: 10.1137/1.9781611972795.13

Outlier Detection with Globally Optimal Exemplar-Based GMM

Abstract: Outlier detection has recently become an important problem in many data mining applications. In this paper, a novel unsupervised algorithm for outlier detection is proposed. First we apply a provably globally optimal Expectation Maximization (EM) algorithm to fit a Gaussian Mixture Model (GMM) to a given data set. In our approach, a Gaussian is centered at each data point, and hence, the estimated mixture proportions can be interpreted as probabilities of being a cluster center for all data points. The outlier…
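The fitting step described in the abstract can be illustrated with a minimal sketch. It assumes isotropic Gaussians with one shared bandwidth `sigma` (the abstract does not state the covariance model), and the function name `exemplar_mixture_weights` is hypothetical. Because every component is pinned to a data point, only the mixture proportions are updated; the log-likelihood is concave in those proportions, which is consistent with the global-optimality claim.

```python
import math

def exemplar_mixture_weights(points, sigma=1.0, iters=200):
    """EM for the mixing proportions of a mixture with one fixed Gaussian per point.

    Assumption: isotropic components with a shared bandwidth `sigma`.
    """
    n = len(points)
    # f[i][j]: unnormalized density of point i under the component centered
    # at point j. The shared normalizing constant cancels in the EM update.
    f = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, c)) / (2 * sigma ** 2))
          for c in points] for p in points]
    q = [1.0 / n] * n  # start from uniform proportions
    for _ in range(iters):
        # Mixture density (up to the shared constant) at each point.
        mix = [sum(q[j] * f[i][j] for j in range(n)) for i in range(n)]
        # Combined E and M step: q_j <- (1/n) * sum_i q_j * f_ij / mix_i
        q = [q[j] * sum(f[i][j] / mix[i] for i in range(n)) / n
             for j in range(n)]
    return q
```

In this sketch a point far from all others keeps a proportion near 1/n, since only its own component can explain it, so the proportions alone would not rank such a point as an outlier.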

Cited by 84 publications (47 citation statements)
References 29 publications (47 reference statements)
“…$K(x_m, x_n) = \exp(\|x_m - x_n\|^2 / h)$, and the parameter $h$ and the ridge parameter $\gamma$ were set at 100 and 0.1, respectively. To measure the performance of each algorithm, we utilized the Detection rate (True positive rate), the False alarm rate (False positive rate) from the work 17, and the ROC curves, which are defined as follows:…”
Section: Results (mentioning)
confidence: 99%
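The quoted statement elides its definitions, so the sketch below uses the standard ones: detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN), with a ROC curve traced by sweeping a threshold over outlier scores. The function names are hypothetical, and the code assumes that higher scores mean "more outlying" and that both classes are present in `y_true`.

```python
def detection_and_false_alarm_rates(y_true, y_pred):
    """Return (detection rate, false alarm rate) for binary outlier labels.

    Assumption: 1 marks an outlier, 0 an inlier, and both classes occur.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return tp / (tp + fn), fp / (fp + tn)

def roc_points(y_true, scores):
    """Return (false alarm rate, detection rate) pairs, one per threshold."""
    points = []
    # Sweep thresholds from the highest score down to the lowest.
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        tpr, fpr = detection_and_false_alarm_rates(y_true, y_pred)
        points.append((fpr, tpr))
    return points
```

For example, `roc_points([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])` yields one point per distinct score, ending at (1.0, 1.0) once every sample is flagged.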
“…Along this line, He et al 16 proposed FindCBLOF to determine the Cluster-Based Local Outlier Factor (CBLOF) for each data point. Yang et al 17 had introduced a globally optimal exemplar-based GMM to detect the outliers, in which a Gaussian is centered at each point. The outlier factor at each point is calculated by the weighted sum of the mixture proportion with the weights representing the similarities to the other points.…”
Section: Introduction (mentioning)
confidence: 99%
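One possible reading of the outlier factor described in this statement is sketched below. The Gaussian similarity, the shared bandwidth `sigma`, the exclusion of a point's own term, and the name `outlier_factors` are all assumptions; the statement does not give the exact formula or normalization. Under this reading, a small factor suggests an outlier.

```python
import math

def outlier_factors(points, q, sigma=1.0):
    """Weighted sum of mixture proportions q[j], weighted by similarity to point i.

    Assumptions: Gaussian similarity with bandwidth `sigma`; the sum runs over
    the other points only (j != i).
    """
    n = len(points)
    factors = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                continue  # only similarities to the other points contribute
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            total += math.exp(-d2 / (2 * sigma ** 2)) * q[j]
        factors.append(total)
    return factors
```

With three tightly grouped points and one distant point, all given equal proportions, the distant point receives a factor close to zero while the grouped points receive factors close to the summed proportions of their neighbours.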
“…The most straightforward outlier detection method, model-based method, is to create a model for all samples, and then predict outliers as those having large deviations from the established profiles. For example, the Gaussian mixture model (GMM) [11] fits the whole dataset to a mixed Gaussian distribution and evaluates the parameters through the Expectation-Maximization [29] or a deep estimation network [12]. However, GMM needs to predetermine the appropriate cluster type and number, which are crucial and extremely difficult.…”
Section: Classic Outlier Detection Methods (mentioning)
confidence: 99%
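The model-based idea in this statement (fit a model to all samples, then flag large deviations) can be shown in its simplest form. The sketch below deliberately swaps the mixture for a single 1-D Gaussian fitted by its mean and standard deviation, so it is not a GMM and not the cited method; the function names and the threshold of 2 standard deviations are hypothetical, and the data are assumed not to be all identical.

```python
import math

def gaussian_deviation_scores(xs):
    """Absolute z-score of each sample under a single Gaussian fitted to xs.

    Assumption: xs has nonzero variance.
    """
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [abs(x - mean) / std for x in xs]

def flag_outliers(xs, threshold=2.0):
    """Indices of samples whose deviation from the fitted model exceeds threshold."""
    scores = gaussian_deviation_scores(xs)
    return [i for i, s in enumerate(scores) if s > threshold]
```

A mixture would replace the single fitted density with a weighted sum of components, which is where the choice of component type and number mentioned in the statement comes in.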
“…We compare MO-GAAL with nine representative outlier detection algorithms. They can be divided into seven categories: (i) two density-based methods, LOF [18] and KDEOS [38]; (ii) two density estimators, GMM [11] and Parzen [17]; (iii) a typical distance-based approach, kNN [37]; (iv) an angle-based model, FastABOD [38]; (v) a cluster-based model, k-means; (vi) a popular one-class classification model, OC-SVM [35] and (vii) the Active-Outlier detection model, AO [24]. In addition, AGPO and SO-GAAL are also compared on real-world datasets to demonstrate the necessity of using multiple generators with different objectives.…”
Section: Evaluation Measures (mentioning)
confidence: 99%