19th IEEE International Conference on Tools With Artificial Intelligence (ICTAI 2007), 2007
DOI: 10.1109/ictai.2007.125

A Scalable and Efficient Outlier Detection Strategy for Categorical Data

Cited by 58 publications
(54 citation statements)
References 15 publications
“…Koufakou et al (2007) experimented with categorical outlier detection approaches, and proposed AVF (attribute value frequency), a simple, fast, and scalable method for categorical sets. Koufakou et al (2008a) proposed a parallel AVF method using the MapReduce paradigm of parallel programming (Dean and Ghemawat 2004), which ensures fault tolerance and load balancing.…”
Section: Related Work (mentioning)
confidence: 99%
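The MapReduce formulation mentioned in this statement can be illustrated with a small single-process sketch. It is not the cited authors' implementation: the function names `map_count`, `reduce_counts`, and `score_partition` are hypothetical, the partitions are toy data, and the score is assumed to be the mean frequency of a record's attribute values. Fault tolerance and load balancing, which a real MapReduce runtime provides, are not modeled here.

```python
from collections import Counter
from functools import reduce


def map_count(partition):
    """Map step: count (attribute index, value) pairs within one partition."""
    return Counter((j, v) for row in partition for j, v in enumerate(row))


def reduce_counts(a, b):
    """Reduce step: merge two partial count tables into one."""
    return a + b


def score_partition(partition, totals):
    """Second pass: mean global frequency of each row's attribute values."""
    return [
        sum(totals[(j, v)] for j, v in enumerate(row)) / len(row)
        for row in partition
    ]


# Toy data split into two partitions (illustrative only).
partitions = [
    [("a", "x"), ("a", "x")],
    [("a", "y"), ("b", "z")],
]

# Each partition is counted independently, then the partial counts are merged.
totals = reduce(reduce_counts, map(map_count, partitions), Counter())

# Scoring needs only the merged table, so each partition can be scored on its own.
partition_scores = [score_partition(p, totals) for p in partitions]
```

Because counting and scoring each touch one partition at a time, both passes could in principle be distributed across workers; only the merged count table has to be shared.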
“…Most existing categorical data oriented methods are based on a general assumption that anomalies lie in regions of low frequency (Akoglu et al, 2012; Ghoting, Otey, & Parthasarathy, 2004; He et al, 2005; Koufakou, Ortiz, Georgiopoulos, Anagnostopoulos, & Reynolds, 2007; Koufakou & Georgiopoulos, 2010; Smets & Vreeken, 2011; He, Deng, Xu, & Huang, 2006). Typical examples are frequent patterns based methods FPOF (He et al, 2005) and infrequent patterns based methods LOADED (Ghoting et al, 2004).…”
Section: Methods For Categorical Data (mentioning)
confidence: 99%
“…FPOF is a state-of-the-art frequency-based method for categorical data. FPOF was selected as the frequency-based contender over another representative method LOADED (Ghoting et al, 2004) because, as reported in (Koufakou et al, 2007; Wu & Wang, 2013), FPOF performs more effectively than LOADED in a range of real-world data sets, and it also has lower time complexity than LOADED. COMPREX was selected as a contender because it is related to ZERO++ in that both of them operate in a set of subspaces and it is a recently proposed state-of-the-art anomaly detector for categorical data.…”
Section: Contenders and Their Parameter Settings (mentioning)
confidence: 99%
“…Similarly it was done for the RandomK strategy: we averaged its results over 30 runs. Finally, LOF was launched using four different values (10, 20, 30, 40) of the k parameter (the number of neighbors).…”
Section: Methods (mentioning)
confidence: 99%
“…In [19] a greedy algorithm is presented and adopts a principle based on entropy-change after instances removal. [20] proposes a method that assigns a score to each attribute-value pair based on its frequency. Objects with infrequent attribute values are candidate outliers.…”
Section: Anomaly Detection In Categorical Domains (mentioning)
confidence: 99%
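The frequency-based scoring described in this statement can be sketched in a few lines. This is a minimal illustration, not the cited paper's code: it assumes the score of a record is the mean count of its attribute values across the data set, and the helper names `avf_scores` and `lowest_scoring`, the toy records, and the choice of `k` are all hypothetical.

```python
from collections import Counter


def avf_scores(records):
    """Return one score per record: the mean frequency of its attribute values."""
    n_attrs = len(records[0])
    # One frequency table per attribute (column).
    counts = [Counter(row[j] for row in records) for j in range(n_attrs)]
    return [
        sum(counts[j][row[j]] for j in range(n_attrs)) / n_attrs
        for row in records
    ]


def lowest_scoring(records, k):
    """Return the indices of the k records with the lowest scores."""
    scores = avf_scores(records)
    order = sorted(range(len(records)), key=lambda i: scores[i])
    return order[:k]


# Toy categorical data (illustrative only).
records = [
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("red", "small", "round"),
    ("blue", "large", "square"),
]

scores = avf_scores(records)
outliers = lowest_scoring(records, k=1)
```

In this toy set the last record carries the rarest values in every column, so it receives the lowest score and is the single candidate outlier returned for `k=1`.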