Deep Comprehensive Correlation Mining for Image Clustering

Wu, Jian; Long, Keping; Wang, Fei; Qian, Chen; Cheng, Li; Lin, Zhouchen; Zha, Hongbin

doi:10.1109/iccv.2019.00824

Cited by 160 publications

(140 citation statements)

References 36 publications

Citation types: Supporting, Mentioning (140), Contrasting


“…(1) CIFAR-10(/100) [28]: A natural image dataset with 50,000/10,000 samples from 10(/100) classes for training and testing respectively. We adopted the clustering setup same as [24,44,7]: Using both the training and test sets (without labels) for CIFAR10/100 and STL-10, and only the training set for ImageNet-10, ImageNet-Dogs and Tiny-ImageNet; Taking the 20 super-classes of CIFAR-100 as the ground-truth.…”

Section: Methods (mentioning)

confidence: 99%

“…Existing deep clustering approaches generally fall into two categories according to the training strategy: (1) Alternate training [49,46,7,44,6,50,15] and (2) Simultaneous training [23,36,35,16]. Alternate training strategy usually estimates the ground-truth membership according to the pretrained or up-to-date model and in return supervises the network training by the estimated information. DEC [46] initialises cluster centroids by conducting K-means [32] on pretrained image features and then fine-tunes the model to learn from the confident cluster assignments to sharpen the resulted prediction distribution.…”

Section: Related Work (mentioning)

confidence: 99%
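The "sharpening" step mentioned in the statement above can be illustrated with a minimal sketch. It assumes soft assignments are already available as rows of probabilities and uses the square-and-renormalise target commonly associated with DEC; the function name `sharpen` and the toy values are hypothetical and not taken from the cited papers.

```python
def sharpen(q):
    """Sharpen soft cluster assignments (assumed form: square, divide by
    soft cluster frequency, renormalise each row).

    q: list of rows, each a probability vector over clusters.
    """
    n_clusters = len(q[0])
    # Soft cluster frequencies f_j = sum_i q_ij
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


# Toy example (hypothetical values): two samples, two clusters.
q = [[0.6, 0.4], [0.3, 0.7]]
p = sharpen(q)
```

In this toy case each row of `p` puts more mass on its already dominant cluster than the corresponding row of `q`, which is the sense in which confident assignments are reinforced.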

“…The method in [49] jointly optimises the objectives of Auto-Encoder [2] and K-means [32] and alternately estimates cluster assignment to learn a "clustering-friendly" latent space. DAC [7], DDC [6] and DCCM [44] exploit the inter-samples relations according to the pairwise distance between the latest sample features and train the model accordingly. Whilst explicit local learning constraints on cluster assignment computed from either the pretrained or up-to-date models usually lead to a deterministic clustering solution, these approaches suffer from the problem of more severe error-propagation from the inconsistent estimations in the neighbourhoods during training.…”

Section: Related Work (mentioning)

confidence: 99%
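As a rough illustration of how inter-sample relations might be turned into training targets, the sketch below labels sample pairs from feature similarity. It is an assumption-laden toy, not the procedure of DAC, DDC or DCCM: cosine similarity stands in for the "pairwise distance" in the statement, and the thresholds `upper` and `lower` as well as the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pairwise_pseudo_labels(features, upper=0.9, lower=0.5):
    """Return {(i, j): 1 or 0} for confidently similar/dissimilar pairs.

    Pairs whose similarity falls between the two (hypothetical)
    thresholds are treated as ambiguous and left out.
    """
    labels = {}
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(features[i], features[j])
            if s >= upper:
                labels[(i, j)] = 1   # assumed same cluster
            elif s <= lower:
                labels[(i, j)] = 0   # assumed different clusters
    return labels


# Toy features (hypothetical values).
feats = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.7, 0.7]]
pairs = pairwise_pseudo_labels(feats)
```

Because the labels are recomputed from the current features, a wrong pair label can reinforce itself in later iterations, which is the error-propagation risk the statement describes.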

“…The 1st/2nd best results are indicated in red/blue. The results of previous methods are taken from [44,24]. †: The best result among multiple trials.…”

Section: Evaluation Metrics (mentioning)

confidence: 99%

“…Recent deep clustering models either iteratively estimate cluster assignment and/or inter-sample relations which are then used as hypotheses in supervising the learning of deep neural networks [44,6,7,50], or used in conjunction with clustering constraints [24,16,23]. In ideal cases, such alternation-learning methods can approach the performance of supervised models not the least benefiting from their robustness against noisy labels [18].…”

Section: Introduction (mentioning)

confidence: 99%


Deep Semantic Clustering by Partition Confidence Maximisation

Huang, Gong, Zhu (2020)

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

145


By simultaneously learning visual features and data grouping, deep clustering has shown impressive ability to deal with unsupervised learning for structure analysis of high-dimensional visual data. Existing deep clustering methods typically rely on local learning constraints based on inter-sample relations and/or self-estimated pseudo labels. This is susceptible to the inevitable errors distributed in the neighbourhoods and suffers from error-propagation during training. In this work, we propose to solve this problem by learning the most confident clustering solution from all the possible separations, based on the observation that assigning samples from the same semantic categories into different clusters will reduce both the intra-cluster compactness and inter-cluster diversity, i.e. lower partition confidence. Specifically, we introduce a novel deep clustering method named PartItion Confidence mAximisation (PICA). It is established on the idea of learning the most semantically plausible data separation, in which all clusters can be mapped to the ground-truth classes one-to-one, by maximising the "global" partition confidence of clustering solution. This is realised by introducing a differentiable partition uncertainty index and its stochastic approximation as well as a principled objective loss function that minimises such index, all of which together enables a direct adoption of the conventional deep networks and mini-batch based model training. Extensive experiments on six widely-adopted clustering benchmarks demonstrate our model's performance superiority over a wide range of the state-of-the-art approaches. The code is available online.
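One possible reading of a batch-level partition uncertainty measure is sketched below. It assumes the index is the cosine similarity between cluster-wise assignment vectors, that is, the columns of a batch of predicted cluster probabilities. This is an illustrative assumption and not the authors' released code; the function name and the toy batches are hypothetical.

```python
import math


def partition_uncertainty(probs):
    """Cosine similarity between cluster-wise assignment vectors.

    probs: list of rows, one per sample, each a probability vector over
    K clusters. Returns a K x K matrix. Small off-diagonal entries mean
    the clusters share few samples, i.e. a confident partition.
    """
    k = len(probs[0])
    cols = [[row[j] for row in probs] for j in range(k)]

    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm > 0 else 0.0  # guard against empty clusters

    return [[cos(cols[a], cols[b]) for b in range(k)] for a in range(k)]


# Toy batches (hypothetical values).
confident = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
pui_confident = partition_uncertainty(confident)
pui_uncertain = partition_uncertainty(uncertain)
```

Under this reading, a loss that pushes the matrix towards the identity would favour partitions in which every sample is assigned to exactly one cluster with high probability.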


Clustering one million molecular structures on GPU within seconds

Gao, Wu, Liao et al. 2024

J Comput Chem


Structure clustering is a general but time‐consuming work in the study of life science. Up to now, most published tools do not support the clustering analysis on graphics processing unit (GPU) with root mean square deviation metric. In this work, we specially write codes to do the work. It supports multiple threads on multiple GPUs. To show the performance, we apply the program to a 33‐residue fragment in protein Pin1 WW domain mutant. The dataset contains 1,400,000 snapshots, which are extracted from an enhanced sampling simulation and distribute widely in the conformational space. Various testing results present that our program is quite efficient. Particularly, with two NVIDIA RTX4090 GPUs and single precision data type, the clustering calculation on 1 million snapshots is completed in a few seconds (including the uploading time of data from memory to GPU and neglecting the reading time from hard disk). This is hundreds of times faster than central processing unit. Our program could be a powerful tool for fast extraction of representative states of a molecule among its thousands to millions of candidate structures.
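For readers unfamiliar with the metric named in this abstract, here is a minimal CPU sketch of the root mean square deviation between two conformations. It assumes the structures are already superimposed and have the same atom ordering; it does not reproduce the GPU program described above, and the function name and coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """Root mean square deviation between two conformations.

    a, b: lists of (x, y, z) tuples with the same atom ordering.
    Assumes the structures are already aligned (no superposition step).
    """
    if len(a) != len(b):
        raise ValueError("conformations must have the same number of atoms")
    sq = sum((pa - pb) ** 2 for atom_a, atom_b in zip(a, b)
             for pa, pb in zip(atom_a, atom_b))
    return math.sqrt(sq / len(a))


# Toy conformations (hypothetical coordinates): b is a shifted by 1 along z.
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
d_ab = rmsd(conf_a, conf_b)
```

Clustering on this metric needs many such pairwise evaluations, which is why the per-pair cost dominates for datasets of the size the abstract mentions.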


Deep Image Clustering with Category-Style Representation

Zhao et al. 2020

Lecture Notes in Computer Science

