“…Euclidean distance in d-dimensional space), the closest clusters are merged, and merging continues until all subjects have been merged into either the pre-specified number of clusters (k) or a single cluster [186]. BIRCH: Speed, scalability; CURE: Arbitrary shapes; Spectral: Performs dimensionality reduction before clustering, based on the similarity matrix that describes the similarity between each pair of data points. Hierarchical: Comparison of multimorbidity patterns in Hong Kong and Zurich using hierarchical agglomerative clustering [196]; BIRCH: Ability to detect outlier clusters of depressed patients and polypharmacy patients not detectable using regression methods [197]; CURE: CURE-SMOTE, a hybrid algorithm for feature selection, parameter optimization and synthetic minority oversampling technique (SMOTE) based on random forests [198]; STING: Useful for mining of geospatial data [199]; Spectral: Clustering high-dimensional data via feature selection [200]; Affinity propagation: Parallel clustering algorithm for large-scale biological data sets [201] | Model-based | |
Algorithms: Gaussian Mixture Model (GMM), Expectation–Maximisation (EM), Dirichlet Mixture Model (DMM), CLARANS, Self-Organising Map (SOM), Adaptive Resonance Theory (ART). Specific features: Integrates background knowledge into gene expression, interactomes, and sequences. Models are an oversimplification: if their assumptions are false, the results are inaccurate | GMM, EM, DMM, CLARANS, DBSCAN: Clustering compositional data using Dirichlet mixture model [185] |
Density-based Algorithms: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [202], Ordering Points To Identify the Clustering Structure (OPTICS), Mean-shift. Specific features: DBSCAN regards clusters as dense regions of objects in space that are separated by regions of low density. |
…”
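
The agglomerative procedure described in the excerpt (start from individual subjects, repeatedly merge the two closest clusters under Euclidean distance, and stop at either k clusters or a single cluster) can be illustrated with a minimal sketch. The function names, the toy data, and the choice of single linkage are assumptions made here for illustration only; the excerpt does not state which linkage criterion is used, and this is not the implementation from any of the cited studies.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension d."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(points, k=1):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Assumption: single linkage (cluster distance = distance between the
    closest pair of members); the excerpt does not state the linkage.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Start with every subject in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest single-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                euclidean(points[i], points[j])
                for i in clusters[pair[0]]
                for j in clusters[pair[1]]
            ),
        )
        # Merge cluster b into cluster a; b > a, so deleting b keeps a valid.
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: two well-separated groups in 2-dimensional space.
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
two_clusters = agglomerative(toy_points, k=2)
one_cluster = agglomerative(toy_points, k=1)
```

Setting `k=1` corresponds to merging until one cluster remains, while a larger `k` stops at the pre-specified number of clusters. This brute-force search recomputes all pairwise distances at every merge, so it is only suitable for small examples.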