Cross-modal hashing (CMH) has been widely applied to retrieval across modalities owing to its fast computation and low storage cost. However, several challenges still need to be addressed: (1) most existing CMH methods take as input affinity graphs that are predefined separately in each modality to model the data distribution, and thus fail to consider the correlation of graph structure across modalities; moreover, the retrieval results depend heavily on the quality of these predefined graphs; (2) most existing CMH methods preserve intra-modal and inter-modal affinity independently when learning the binary codes, ignoring the fusion affinity among multi-modality data; (3) most existing CMH methods relax the discrete constraints to solve the optimization objective, which can significantly degrade retrieval performance. To address these limitations, we propose a novel Anchor Graph Structure Fusion Hashing (AGSFH) method. AGSFH constructs an anchor graph structure fusion matrix from the anchor graphs of multiple modalities via the Hadamard product, which fully exploits the geometric properties of the underlying data structure across modalities. Based on this fusion matrix, AGSFH directly learns an intrinsic anchor graph whose structure is adaptively tuned so that the number of connected components equals the number of clusters; as a result, training instances are clustered in the semantic space. In addition, AGSFH preserves the anchor fusion affinity in a common binary Hamming space, so that the hash codes capture the intrinsic similarity and structure across modalities. Furthermore, a discrete optimization framework is designed to learn the unified binary codes across modalities. Extensive experiments on three public social datasets demonstrate the superiority of AGSFH in cross-modal retrieval.
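The anchor graph fusion step can be illustrated with a minimal NumPy sketch. The Gaussian kernel, anchor selection, and row normalization below are assumptions made for illustration only, not the exact construction used by AGSFH; the sketch merely shows how per-modality anchor graphs can be combined with a Hadamard (element-wise) product.

```python
import numpy as np

def anchor_graph(X, anchors, sigma=1.0):
    """Hypothetical anchor affinity matrix Z of shape (n, m):
    Gaussian similarities between n samples and m anchors,
    row-normalized so each row sums to 1."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)  # squared distances
    Z = np.exp(-d2 / (2 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)

def fuse_anchor_graphs(Z_list):
    """Hadamard product of per-modality anchor graphs, followed by
    row re-normalization, as a stand-in for the fusion matrix."""
    F = np.ones_like(Z_list[0])
    for Z in Z_list:
        F *= Z                        # element-wise product across modalities
    return F / (F.sum(axis=1, keepdims=True) + 1e-12)

# Toy usage: two modalities (e.g., image and text features) with anchors
# chosen at the same sample indices so they correspond across modalities.
rng = np.random.default_rng(0)
X_img, X_txt = rng.normal(size=(100, 64)), rng.normal(size=(100, 32))
A_img, A_txt = X_img[:10], X_txt[:10]          # 10 anchors per modality
Z_img = anchor_graph(X_img, A_img)
Z_txt = anchor_graph(X_txt, A_txt)
F = fuse_anchor_graphs([Z_img, Z_txt])         # anchor graph structure fusion matrix
print(F.shape)                                 # (100, 10)
```

Because the product is taken element-wise, an affinity between a sample and an anchor survives only if it is supported in every modality, which is one intuition behind fusing the graph structures rather than preserving each modality's affinity independently.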