Due to the high cost of data annotation in supervised learning for person re-identification (Re-ID), unsupervised learning has become more attractive for real-world deployment. The Bottom-up Clustering (BUC) approach based on hierarchical clustering is one promising unsupervised method. A key factor in BUC is the distance measurement strategy: ideally, the measurement should account for both the inter-cluster and intra-cluster distances of all samples. However, BUC uses the minimum distance, which considers only the pair of nearest samples between two clusters and ignores the diversity of the remaining samples. To solve this problem, we propose to use the energy distance to evaluate both inter-cluster and intra-cluster distance in hierarchical clustering (E-cluster), and use the sum of squares of deviations (SSD) as a regularization term to further balance the diversity and similarity of the energy-distance evaluation. We evaluate our method on large-scale re-ID datasets, including Market-1501, DukeMTMC-reID and MARS. Extensive experiments show that our method obtains significant improvements over state-of-the-art unsupervised methods, and even outperforms some transfer-learning methods.
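As a rough illustration of the idea, the sketch below computes the energy distance between two clusters of feature vectors, which combines one inter-cluster term with two intra-cluster terms, and a hypothetical merge criterion that adds an SSD penalty on the merged cluster. The abstract does not specify how the SSD term is weighted or applied, so the `merge_score` function and its `lam` weight are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def pairwise_mean_distance(a, b):
    """Mean Euclidean distance over all pairs drawn from a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]            # (n, m, d)
    return np.linalg.norm(diff, axis=-1).mean()

def energy_distance(x, y):
    """Energy distance between clusters x (n, d) and y (m, d).

    2 * E||X - Y|| - E||X - X'|| - E||Y - Y'||: the inter-cluster term is
    balanced against both intra-cluster terms, so compact, well-separated
    clusters yield larger values than loose, overlapping ones.
    """
    return (2.0 * pairwise_mean_distance(x, y)
            - pairwise_mean_distance(x, x)
            - pairwise_mean_distance(y, y))

def ssd(z):
    """Sum of squared deviations of samples z (k, d) from their centroid."""
    return ((z - z.mean(axis=0)) ** 2).sum()

def merge_score(x, y, lam=1.0):
    """Hypothetical merge criterion: energy distance regularized by the SSD
    of the merged cluster (lam is an assumed balancing weight)."""
    return energy_distance(x, y) + lam * ssd(np.vstack([x, y]))
```

In a bottom-up pass, one would merge the pair of clusters with the lowest score at each step; the exact stopping rule and weighting are left to the paper.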
Unsupervised domain adaptation is a challenging task in person re-identification (re-ID). Recently, cluster-based methods have achieved good performance; clustering and training are the two key phases of these methods. For clustering, a major issue of existing methods is that they do not fully exploit the information in outliers, either discarding outliers or simply merging them into clusters. For training, existing methods use only source features for pretraining and target features for fine-tuning, and do not make full use of all the valuable information in the source and target datasets. To solve these problems, we propose a Threshold-based Hierarchical clustering method with Contrastive loss (THC). THC has two key features: (1) it treats outliers as single-sample clusters that participate in training, preserving the information in outliers without requiring a preset cluster number and combining the advantages of existing clustering methods; (2) it uses a contrastive loss to exploit all valuable information, including source-class centroids, target-cluster centroids and single-sample clusters, thus achieving better performance. We conduct extensive experiments on Market-1501, DukeMTMC-reID and MSMT17. Results show that our method achieves state-of-the-art performance.
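To make the training phase concrete, here is a minimal sketch of a unified contrastive loss in which a target-domain feature is compared against a memory of source-class centroids, target-cluster centroids and single-sample (outlier) clusters. The abstract does not give the exact loss form; the InfoNCE-style formulation, the `temperature` value and the memory layout below are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def unified_contrastive_loss(query, memory, pos_index, temperature=0.05):
    """Hypothetical unified contrastive loss over all centroid types.

    query:     (d,) L2-normalized feature of one target-domain sample.
    memory:    (K, d) L2-normalized entries holding source-class centroids,
               target-cluster centroids and single-sample clusters together.
    pos_index: index of the memory entry the sample belongs to (its positive).
    """
    logits = memory @ query / temperature              # similarity to every entry, (K,)
    target = torch.tensor([pos_index], device=logits.device)
    return F.cross_entropy(logits.unsqueeze(0), target)
```

In practice the memory would be updated (e.g. with a momentum rule) as features are extracted each iteration; those update details are not specified in the abstract and are omitted here.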