2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.282
Generalized Semantic Preserving Hashing for N-Label Cross-Modal Retrieval

Cited by 114 publications (67 citation statements)
References 19 publications
“…
Method           I→T    I→A    I→V    T→I    T→A    T→V    A→I    A→T    A→V    V→I    V→T    V→A    Average
Our FGCrossNet   0.210  0.526  0.606  0.255  0.181  0.208  0.553  0.159  0.443  0.629  0.195  0.437  0.366
MHTN [20]        0.116  0.195  0.281  0.124  0.138  0.185  0.196  0.127  0.290  0.306  0.186  0.306  0.204
ACMR [21]        0.162  0.119  0.477  0.075  0.015  0.081  0.128  0.028  0.068  0.536  0.138  0.111  0.162
JRL [22]         0.160  0.085  0.435  0.190  0.028  0.095  0.115  0.035  0.065  0.517  0.126  0.068  0.160
GSPH [23]        0.140  0.098  0.413  0.179  0.024  0.109  0.129  0.024  0.073  0.512  0.126  0.086  0.159
CMDN [24]        0.099  0.009  0.377  0.123  0.007  0.078  0.017  0.008  0.010  0.446  0.081  0.009  0.105
SCAN [25]        0.050…”
Section: Methods (unclassified)
“…We compare our FGCrossNet with state-of-the-art cross-media retrieval methods, including MHTN [20], ACMR [21], JRL [22], GSPH [23], CMDN [24], SCAN [25], and GXN [26]. MHTN [20] learns common representations for 5 media types by transferring knowledge from a single-media source domain (image) to a cross-media target domain.…”
Section: Compared Methods (mentioning)
confidence: 99%
“…More specifically, only one of the c entries is equal to 1 if the data is annotated with a single label (e.g., L_x^i = [0 0 1 0 0]), and more than one entry will be equal to 1 if the data is marked with multiple labels (e.g., L_y^j = [1 0 1 0 1]). As suggested in [7], [24], a semantic affinity matrix with embedded supervision can be efficiently utilized to learn hash codes of training instances. Accordingly, we first construct an affinity matrix…”
Section: Problem Formulation (mentioning)
confidence: 99%
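
The formulation quoted above can be made concrete with a small sketch. The Python snippet below is a hypothetical illustration, not the cited work's exact construction: the names L_x and L_y and the label-overlap rule are assumptions, chosen only to show how a binary semantic affinity matrix might be derived from multi-label annotation vectors.

    import numpy as np

    # Binary label matrices for two modalities (rows = instances, cols = c labels).
    # A single-label row has exactly one 1; a multi-label row has several 1s.
    L_x = np.array([[0, 0, 1, 0, 0],    # image instance with a single label
                    [1, 0, 1, 0, 1]])   # image instance with multiple labels
    L_y = np.array([[0, 1, 1, 0, 0],    # text instance
                    [0, 0, 0, 0, 1]])   # text instance

    # Affinity S[i, j] = 1 if instance i of modality X and instance j of
    # modality Y share at least one label, else 0 (assumed overlap rule).
    S = (L_x @ L_y.T > 0).astype(np.float32)
    print(S)
    # [[1. 0.]
    #  [1. 1.]]

Under this assumed overlap rule, two instances from different modalities count as semantically affine whenever they share at least one label, which is the kind of supervision signal such an affinity matrix typically encodes for hash-code learning.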
“…Nevertheless, such a unified hash code could inherently sacrifice representation capability and scalability, because it cannot guarantee that the learned binary codes are semantically discriminative for heterogeneous data representation. In addition, the majority of existing cross-modal hashing approaches mainly focus on paired multi-modal collections, and very few works, except for [7], have been designed to handle unpaired scenarios. Remarkably, all these approaches select an equalized hash length to characterize the multi-modal data and make them directly comparable in an isomorphic Hamming space.…”
Section: Introduction (mentioning)
confidence: 99%
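
To illustrate the "isomorphic Hamming space" point in the excerpt above: when both modalities are encoded with hash codes of equal length, cross-modal retrieval reduces to ranking by Hamming distance. The snippet below is a generic sketch of that comparison; the 8-bit codes and variable names are made up for illustration and do not come from any cited paper.

    import numpy as np

    def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
        """Number of differing bits between two equal-length binary codes."""
        return int(np.count_nonzero(a != b))

    # Equal-length (8-bit) codes make an image query and text candidates
    # directly comparable in the same Hamming space.
    image_code = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    text_codes = np.array([[1, 0, 1, 0, 0, 0, 1, 0],   # candidate text 1
                           [0, 1, 0, 1, 1, 1, 0, 1]])  # candidate text 2

    # Rank text candidates by Hamming distance to the image query.
    distances = [hamming_distance(image_code, t) for t in text_codes]
    ranking = np.argsort(distances)
    print(distances, ranking)  # [1, 7] [0 1] -> candidate 1 is the best match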
“…After studying the base case, we consider the effect of the latent loss (L_2) by examining the performance of the full model, excluding the L_2 loss in Equation 6. It is found that L_2 boosts the performance of the model only marginally.”
Section: Critical Analysis (mentioning)
confidence: 99%