Dolphin

Angiulli, Fabrizio; Fassetti, Fabio

doi:10.1145/1497577.1497581

Cited by 85 publications

(18 citation statements)

References 26 publications

“…This approach has some fundamental limitations [Kriegel et al 2011]: (i) the choice of the density threshold is critical; (ii) often, it is not possible to simultaneously detect clusters of varied densities by using a single, global density threshold; and (iii) a flat clustering solution alone cannot describe possible hierarchical relationships that may exist between nested clusters lying on different density levels. Nested clusters at varied levels of density can only be described by hierarchical density-based clustering methods, such as those in Wishart [1969], Wong and Lane [1983], Ankerst et al [1999], Sander et al [2003], Brecheisen et al [2004], Chehreghani et al [2008], Stuetzle and Nugent [2010], Sun et al [2010], , and , which are able to provide more elaborated descriptions of a dataset at different degrees of granularity and resolution.…”

Section: Multi-level Mode Analysis For Clustering (mentioning)

confidence: 99%

“…The hierarchy and the cluster tree produced by the central module (HDBSCAN*) can be postprocessed for multiple tasks. For instance, they are particularly suitable for interactive data exploration, as they can be easily transformed and visualized in different ways, such as an OPTICS reachability plot [Ankerst et al 1999; Sander et al 2003], a silhouette-like plot, a detailed dendrogram, or a compacted cluster tree. In addition, for applications that expect a nonhierarchical partition of the data, the clustering hierarchy can also be postprocessed so that a flat solution (as the best possible nonhierarchical representation of the data in some sense) can be extracted.…”

Section: Contributions (mentioning)

confidence: 99%

“…In this section, we provide an extended description and discussion of our hierarchical clustering method, HDBSCAN*, which can be seen as a conceptual and algorithmic improvement over OPTICS [Ankerst et al 1999].…”

Section: Hierarchical DBSCAN*-HDBSCAN* (mentioning)

confidence: 99%

“…This is a classic smoothing factor in density estimates whose behavior is well understood, and methods that have an analogous parameter (e.g., Ankerst et al [1999], , Pei et al [2009], and Stuetzle and Nugent [2010]) are typically robust to it.…”

Section: Conceptual HDBSCAN* (mentioning)

confidence: 99%

Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection

Campello, Moulavi, Zimek, et al. 2015

ACM Trans. Knowl. Discov. Data

An integrated framework for density-based cluster analysis, outlier detection, and data visualization is introduced in this article. The main module consists of an algorithm to compute hierarchical estimates of the level sets of a density, following Hartigan's classic model of density-contour clusters and trees. Such an algorithm generalizes and improves existing density-based clustering techniques with respect to different aspects. It provides as a result a complete clustering hierarchy composed of all possible density-based clusters following the nonparametric model adopted, for an infinite range of density thresholds. The resulting hierarchy can be easily processed so as to provide multiple ways for data visualization and exploration. It can also be further postprocessed so that: (i) a normalized score of "outlierness" can be assigned to each data object, which unifies both the global and local perspectives of outliers into a single definition; and (ii) a "flat" (i.e., nonhierarchical) clustering solution composed of clusters extracted from local cuts through the cluster tree (possibly corresponding to different density thresholds) can be obtained, either in an unsupervised or in a semisupervised way. In the unsupervised scenario, the algorithm corresponding to this postprocessing module provides a global, optimal solution to the formal problem of maximizing the overall stability of the extracted clusters. If partially labeled objects or instance-level constraints are provided by the user, the algorithm can solve the problem by considering both constraints violations/satisfactions and cluster stability criteria. An asymptotic complexity analysis, both in terms of running time and memory space, is described. Experiments are reported that involve a variety of synthetic and real datasets, including comparisons with state-of-the-art, density-based clustering and (global and local) outlier detection methods.

“…For example, sensor In actual situation, the number of the sensor is very large, and the speed of the data acquisition is high, so it is must to develop a fast and efficient scheme. To handling with high dimensional data, some schemes proposed [7][8][9][10]. Most of these methods share the common view: dimensional reduction.…”

Section: Related Work (mentioning)

confidence: 99%

Anomaly Detection Based on Regularized Vector Auto Regression in Thermal Power Plant

Wei, Wang, et al. 2015

MATEC Web of Conferences

Abstract. Anomaly detection has gained widespread interest especially in the industrial conditions. Contextual anomalies means that sensors of industrial equipment are interrelated and a sensor data instance called anomalous should be in a specific context. In this paper we propose a scheme for temporal sensor data monitor and anomaly detection in thermal power plant. The scheme is based on Regularized Vector Auto Regression, which is used to capture the linear interdependencies among multiple time series. The advantage is that the RVAR model does not require too much knowledge about the forces influencing a variable. The only prior knowledge needed is a list of variables which can be hypothesized to affect each other. Experimental results show that the proposed scheme is efficient compared with other methods such as SVM, BPNN and PCA.

References

2020

SCADA Security: Machine Learning Concepts for Intrusion Detection and Prevention
