2009
DOI: 10.1145/1497577.1497581

Dolphin

Abstract: In this work a novel distance-based outlier detection algorithm, named DOLPHIN, working on disk-resident datasets and whose I/O cost corresponds to the cost of sequentially reading the input dataset file twice, is presented. It is both theoretically and empirically shown that the main memory usage of DOLPHIN amounts to a small fraction of the dataset and that DOLPHIN has linear time performance with respect to the dataset size. DOLPHIN gains efficiency by naturally merging together in a unified schema three str…
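
To make "distance-based outlier" concrete, here is a minimal sketch. It assumes the common definition in which an object is an outlier when fewer than `k` other objects lie within distance `radius` of it; the abstract above does not state the definition. This is a naive, in-memory, quadratic-time scan for illustration only. It is not the DOLPHIN algorithm, and the function name and parameters are hypothetical.

```python
import math


def distance_based_outliers(points, k, radius):
    """Return indices of points with fewer than k other points within radius.

    Naive O(n^2) illustration of the assumed definition, not DOLPHIN itself.
    """
    outliers = []
    for i, p in enumerate(points):
        neighbors = 0
        for j, q in enumerate(points):
            if i != j and math.dist(p, q) <= radius:
                neighbors += 1
                if neighbors >= k:
                    break  # enough neighbors found: p is an inlier
        if neighbors < k:
            outliers.append(i)
    return outliers


# Four tightly grouped points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
result = distance_based_outliers(points, k=2, radius=0.5)
```

The early `break` stops counting as soon as a point is known to be an inlier, so in this naive version the expensive cases are the points that turn out to be outliers.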

Cited by 85 publications (18 citation statements)
References 26 publications
“…This approach has some fundamental limitations [Kriegel et al 2011]: (i) the choice of the density threshold is critical; (ii) often, it is not possible to simultaneously detect clusters of varied densities by using a single, global density threshold; and (iii) a flat clustering solution alone cannot describe possible hierarchical relationships that may exist between nested clusters lying on different density levels. Nested clusters at varied levels of density can only be described by hierarchical density-based clustering methods, such as those in Wishart [1969], Wong and Lane [1983], Ankerst et al [1999], Sander et al [2003], Brecheisen et al [2004], Chehreghani et al [2008], Stuetzle and Nugent [2010], Sun et al [2010], , and , which are able to provide more elaborated descriptions of a dataset at different degrees of granularity and resolution.…”
Section: Multi-level Mode Analysis For Clustering (mentioning)
confidence: 99%
“…The hierarchy and the cluster tree produced by the central module (HDBSCAN*) can be postprocessed for multiple tasks. For instance, they are particularly suitable for interactive data exploration, as they can be easily transformed and visualized in different ways, such as an OPTICS reachability plot [Ankerst et al 1999; Sander et al 2003], a silhouette-like plot, a detailed dendrogram, or a compacted cluster tree. In addition, for applications that expect a nonhierarchical partition of the data, the clustering hierarchy can also be postprocessed so that a flat solution, as the best possible nonhierarchical representation of the data in some sense, can be extracted.…”
Section: Contributions (mentioning)
confidence: 99%
“…For example, sensor In actual situation, the number of the sensor is very large, and the speed of the data acquisition is high, so it is must to develop a fast and efficient scheme. To handling with high dimensional data, some schemes proposed [7][8][9][10]. Most of these methods share the common view: dimensional reduction.…”
Section: Related Work (mentioning)
confidence: 99%