2022
DOI: 10.1109/tbdata.2019.2922969

Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams

Abstract: Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address these two problems; the mu…

Cited by 22 publications (20 citation statements)
References 38 publications

“…After calculating its centre c, with Equation (20), and radius r, with Equation (21), the ε-neighbourhood method is again used to find density reachable microclusters. Among them, a process is undertaken to detect the so-called border microclusters [35] inside C, which obviously are not present during the first iteration as C initially contains only one microcluster. Border microclusters are defined as density reachable microclusters that have a density level that is below the density threshold of the first microclusters present in C. Having a threshold that is too high, cluster C will not expand, whilst having a value that is too low, cluster C will contain dissimilar microclusters.…”
Section: Detecting and Forming New Clusters; citation type: mentioning
confidence: 99%
“…Border microclusters are defined as density reachable microclusters that have a density level that is below the density threshold of the first microclusters present in C. Having a threshold that is too high, cluster C will not expand, whilst having a value that is too low, cluster C will contain dissimilar microclusters. Based on the experimental data from the original paper [35], a 10% threshold yields good performance.…”
Section: Detecting and Forming New Clusters; citation type: mentioning
confidence: 99%
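
The border-microcluster test described in the two statements above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the cited papers: the `MicroCluster` fields, the reachability test (centre within the cluster radius plus ε), and the reading of the 10% threshold as a fraction of the seed microcluster's density are all hypothetical choices made for illustration. The centre and radius are passed in directly because Equations (20) and (21) are not reproduced in this report.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass(frozen=True)
class MicroCluster:
    # Hypothetical structure; field names are illustrative only.
    centre: Tuple[float, ...]
    radius: float
    density: float


def density_reachable(cluster_centre, cluster_radius, candidates, eps):
    """Return candidates whose centre lies within cluster_radius + eps.

    ASSUMPTION: this is one plausible epsilon-neighbourhood test, not
    necessarily the one used in the cited work.
    """
    return [m for m in candidates
            if dist(cluster_centre, m.centre) <= cluster_radius + eps]


def split_core_and_border(seed, reachable, threshold_ratio=0.10):
    """Split reachable microclusters by a density threshold.

    ASSUMPTION: the threshold is threshold_ratio * seed.density.
    Microclusters below it are treated as border microclusters.
    """
    density_threshold = threshold_ratio * seed.density
    core: List[MicroCluster] = [m for m in reachable
                                if m.density >= density_threshold]
    border: List[MicroCluster] = [m for m in reachable
                                  if m.density < density_threshold]
    return core, border


# Usage: one seed microcluster and three candidates.
seed = MicroCluster((0.0, 0.0), 1.0, 100.0)
candidates = [
    MicroCluster((1.5, 0.0), 0.5, 80.0),    # near and dense
    MicroCluster((0.0, 1.8), 0.5, 5.0),     # near but sparse
    MicroCluster((10.0, 10.0), 0.5, 90.0),  # dense but far away
]
reachable = density_reachable(seed.centre, seed.radius, candidates, eps=1.0)
core, border = split_core_and_border(seed, reachable)
```

Under these assumptions, raising `threshold_ratio` moves more reachable microclusters into the border list, so the cluster stops expanding, while lowering it admits sparser and potentially dissimilar microclusters, which matches the trade-off described in the statement.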
“…A good overview on density based stream clustering is provided in [3]. More recent proposals for density-clustering include Ant Colony Stream clustering (ACSC) [19], which uses a decentralised swarm intelligence approach, CEDAS [31] and SNCStream+ [8], use a graph structure with micro-clusters as nodes, and Multi-Density Stream Clustering (MDSC) [20], which combines both online and off-line phases into a single online phase and can discover clusters with varying levels of density.…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…In summary, the majority of research on dynamic FS for data streams assume the supervised method [15], [33], [40], [42] and is typically used for classification tasks and not suitable for clustering. Existing stream-clustering algorithms can deal with change at the concept level (concept drift and concept evolution) [14], [19], [20], [31]. However, these methods suffer from the curse of dimensionality and are not designed to track change at the feature level.…”
Section: Related Work; citation type: mentioning
confidence: 99%