Multi-view data clustering attracts more attention than their single view counterparts due to the fact that leveraging multiple independent and complementary information from multi-view feature spaces outperforms the single one. Multi-view Spectral Clustering aims at yielding the data partition agreement over their local manifold structures by seeking eigenvalue-eigenvector decompositions. Among all the methods, Low-Rank Representation (LRR) is effective, by exploring the multiview consensus structures beyond the low-rankness to boost the clustering performance. However, as we observed, such classical paradigm still suffers from the following stand-out limitations for multiview spectral clustering of (1) overlooking the flexible local manifold structure, caused by (2) aggressively enforcing the low-rank data correlation agreement among all views, such strategy therefore cannot achieve the satisfied between-views agreement; worse still, (3) LRR is not intuitively flexible to capture the latent data clustering structures. In this paper, we present the structured LRR by factorizing into the latent low-dimensional data-cluster representations, which characterize the data clustering structure for each view. Upon such representation, (b) the laplacian regularizer is imposed to be capable of preserving the flexible local manifold structure for each view. (c) We present an iterative multi-view agreement strategy by minimizing the divergence objective among all factorized latent data-cluster representations during each iteration of optimization process, where such latent representation from each view serves to regulate those from other views, such intuitive process iteratively coordinates all views to be agreeable. (d) We remark that such data-cluster representation can flexibly encode the data clustering structure from any view with adaptive input cluster number. To this end, (e) a novel non-convex objective function is proposed via the efficient alternating minimization strategy. The complexity analysis are also presented. The extensive experiments conducted against the realworld multi-view datasets demonstrate the superiority over state-of-the-arts.
More often than not, a multimedia data described by multiple features, such as color and shape features, can be naturally decomposed of multi-views. Since multi-views provide complementary information to each other, great endeavors have been dedicated by leveraging multiple views instead of a single view to achieve the better clustering performance. To effectively exploit data correlation consensus among multi-views, in this paper, we study subspace clustering for multi-view data while keeping individual views well encapsulated. For characterizing data correlations, we generate a similarity matrix in a way that high affinity values are assigned to data objects within the same subspace across views, while the correlations among data objects from distinct subspaces are minimized. Before generating this matrix, however, we should consider that multi-view data in practice might be corrupted by noise. The corrupted data will significantly downgrade clustering results. We first present a novel objective function coupled with an angular based regularizer. By minimizing this function, multiple sparse vectors are obtained for each data object as its multiple representations. In fact, these sparse vectors result from reaching data correlation consensus on all views. For tackling noise corruption, we present a sparsity-based approach that refines the angular-based data correlation. Using this approach, a more ideal data similarity matrix is generated for multi-view data. Spectral clustering is then applied to the similarity matrix to obtain the final subspace clustering. Extensive experiments have been conducted to validate the effectiveness of our proposed approach.
Video-based person re-identification (re-id) is a central application in surveillance systems with significant concern in security. Matching persons across disjoint camera views in their video fragments is inherently challenging due to the large visual variations and uncontrolled frame rates. There are two steps crucial to person re-id, namely discriminative feature learning and metric learning. However, existing approaches consider the two steps independently, and they do not make full use of the temporal and spatial information in videos. In this paper, we propose a Siamese attention architecture that jointly learns spatiotemporal video representations and their similarity metrics. The network extracts local convolutional features from regions of each frame, and enhance their discriminative capability by focusing on distinct regions when measuring the similarity with another pedestrian video. The attention mechanism is embedded into spatial gated recurrent units to selectively propagate relevant features and memorize their spatial dependencies through the network. The model essentially learns which parts (where) from which frames (when) are relevant and distinctive for matching persons and attaches higher importance therein. The proposed Siamese model is end-to-end trainable to jointly learn comparable hidden representations for paired pedestrian videos and their similarity value. Extensive experiments on three benchmark datasets show the effectiveness of each component of the proposed deep network while outperforming state-of-the-art methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.