Clustering ensemble selection has proved effective at improving the quality of clustering solutions. The technique rests on two important metrics: diversity and quality. Empirical studies have shown that more effective ensembles can be obtained by considering diversity and quality simultaneously. However, the relationship between these two metrics in base clusterings remains unclear. This paper proposes a new hierarchical selection algorithm that uses a diversity/quality measure based on the Jaccard similarity measure. In the proposed algorithm, subsets of the clustering partitions are selected according to their diversity. The proposed diversity measure is applied in two forms: pair-wise diversity and hybrid diversity. The Hypergraph-Partitioning Algorithm (HGPA), Cluster-based Similarity Partitioning Algorithm (CSPA), and Meta-CLustering Algorithm (MCLA) were used to obtain the consensus solutions for the ensembles selected by the hierarchical method. Experimental results on 14 datasets showed that selecting a subset of base clusterings with the proposed algorithm yielded more accurate results than the full ensemble, demonstrating the effectiveness and robustness of the proposed algorithm and its new diversity measure.
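To make the idea of a Jaccard-based pair-wise diversity concrete, the following is a minimal sketch and not the measure defined in the paper, whose exact formulation is not given in this abstract. It assumes the common pair-counting form of the Jaccard index between two partitions (shared co-clustered object pairs divided by all co-clustered object pairs) and takes diversity to be one minus that similarity. The function names `co_clustered_pairs`, `jaccard_similarity`, `pairwise_diversity`, and `mean_pairwise_diversity` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Return the set of index pairs (i, j), i < j, that share a cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def jaccard_similarity(labels_a, labels_b):
    """Pair-counting Jaccard index between two partitions of the same objects.

    Assumption: this is the standard pair-counting definition, used here only
    as a stand-in for the paper's Jaccard-based measure.
    """
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    if not union:
        # Both partitions consist only of singletons, so they are identical.
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


def pairwise_diversity(labels_a, labels_b):
    """Diversity of two base clusterings, taken here as 1 - Jaccard similarity."""
    return 1.0 - jaccard_similarity(labels_a, labels_b)


def mean_pairwise_diversity(ensemble):
    """Average pair-wise diversity over all pairs of base clusterings."""
    pairs = list(combinations(ensemble, 2))
    return sum(pairwise_diversity(a, b) for a, b in pairs) / len(pairs)
```

Because the index is computed over co-clustered pairs, it is unaffected by how cluster labels are named: two clusterings that group the objects identically have diversity 0 even if their label values differ. A selection procedure could, under these assumptions, rank candidate subsets of base clusterings by `mean_pairwise_diversity` before passing them to a consensus function such as CSPA, HGPA, or MCLA.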