Pairwise constraints can enhance clustering performance in constraint-based clustering problems, especially when the constraints are informative. In this paper, a novel active learning algorithm for pairwise constraint formulation is constructed with the aim of formulating informative pairwise constraints efficiently and economically. The algorithm consists of three phases: Selecting, Exploring, and Consolidating. In the Selecting phase, an unsupervised clustering algorithm is used to obtain an informative data set in terms of Shannon entropy. In the Exploring phase, a farthest-first strategy is used to construct a series of queries that build the clustering skeleton set structure, and informative pairwise constraints are collected from the informative data set along the way. If the number of skeleton sets equals the number of clusters, the algorithm enters the third phase, Consolidating; otherwise, it terminates. In the Consolidating phase, the non-skeleton points in the informative data set are used to construct a series of queries against the representative points of the skeleton sets built in the Exploring phase, and a priority principle is proposed to help collect more must-link pairwise constraints. With the well-known MPCK-means (metric pairwise constrained K-means) as the underlying constraint-based semi-supervised clustering algorithm, the new algorithm is compared experimentally with its counterparts. The experimental results show that the new algorithm achieves a significant improvement.
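The Selecting and Exploring phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the entropy is computed or how queries are ordered, so the sketch assumes that each point carries a soft cluster-membership distribution, that the first candidate seeds the first skeleton set, and that exploration stops once k skeleton sets exist or a query budget is spent. The names `shannon_entropy`, `select_informative`, `explore`, `oracle`, and `max_queries` are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_informative(memberships, m):
    """Return the indices of the m points with the highest membership entropy.

    Assumption: memberships[i] is a soft cluster-membership distribution
    for point i, produced by some unsupervised clustering step.
    """
    ranked = sorted(range(len(memberships)),
                    key=lambda i: shannon_entropy(memberships[i]),
                    reverse=True)
    return ranked[:m]


def explore(points, candidates, k, oracle, max_queries):
    """Farthest-first exploration over the candidate indices.

    oracle(i, j) answers True for must-link and False for cannot-link.
    Returns the skeleton sets plus the collected must-link and
    cannot-link pairs.
    """
    skeleton = [[candidates[0]]]          # assumed seed: the first candidate
    must_link, cannot_link = [], []
    remaining = set(candidates[1:])
    queries = 0
    while remaining and len(skeleton) < k and queries < max_queries:
        placed = [i for s in skeleton for i in s]
        # Farthest-first: take the candidate whose nearest placed point
        # is as far away as possible.
        x = max(sorted(remaining),
                key=lambda i: min(math.dist(points[i], points[j])
                                  for j in placed))
        remaining.remove(x)
        for s in skeleton:
            if queries >= max_queries:
                break                      # budget spent; x stays unplaced
            queries += 1
            rep = s[0]                     # representative of this set
            if oracle(x, rep):
                must_link.append((x, rep))
                s.append(x)
                break
            cannot_link.append((x, rep))
        else:
            # Cannot-link with every representative: x starts a new set.
            skeleton.append([x])
    return skeleton, must_link, cannot_link
```

In this sketch, a caller would proceed to a Consolidating step only when `len(skeleton) == k`, mirroring the condition stated above; the priority principle used in that phase is not described in enough detail here to sketch.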