Proceedings of the 22nd ACM International Conference on Multimedia 2014
DOI: 10.1145/2647868.2654902

Cross-modal Retrieval with Correspondence Autoencoder

Abstract: The problem of cross-modal retrieval, e.g., using a text query to search for images and vice versa, is considered in this paper. A novel model involving correspondence autoencoder (Corr-AE) is proposed here for solving this problem. The model is constructed by correlating hidden representations of two uni-modal autoencoders. A novel optimal objective, which minimizes a linear combination of representation learning errors for each modality and correlation learning error between hidden representations of two mod…
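
The objective described in the abstract can be illustrated with a minimal sketch. The abstract only states that the loss is a linear combination of per-modality representation learning errors and a correlation learning error between hidden representations; the squared-error distances, the `alpha` / `1 - alpha` weighting, and all function and parameter names below are assumptions made for illustration, not the authors' confirmed formulation.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def corr_ae_loss(img, img_recon, txt, txt_recon, h_img, h_txt, alpha=0.2):
    """Hypothetical combined objective for two coupled autoencoders.

    img / txt             : input vectors of the two modalities
    img_recon / txt_recon : their reconstructions from each autoencoder
    h_img / h_txt         : hidden representations of the two modalities
    alpha                 : assumed trade-off weight in [0, 1]
    """
    # Representation learning error: one reconstruction term per modality.
    reconstruction = squared_error(img, img_recon) + squared_error(txt, txt_recon)
    # Correlation learning error: distance between the two hidden codes.
    correlation = squared_error(h_img, h_txt)
    # Linear combination of the two kinds of error (weighting is an assumption).
    return (1 - alpha) * reconstruction + alpha * correlation


loss = corr_ae_loss(
    img=[1.0, 0.0], img_recon=[0.5, 0.0],
    txt=[0.0, 1.0], txt_recon=[0.0, 1.0],
    h_img=[1.0, 1.0], h_txt=[1.0, 0.0],
    alpha=0.5,
)
print(loss)
```

With `alpha=0` the sketch reduces to two independent autoencoders; with `alpha=1` only the agreement between the hidden representations is penalized.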

Cited by 523 publications (442 citation statements)
References 20 publications
“…Existing methods [5,9,10] mostly combine inter-media constraints (such as correlation constraints [10]) and intra-media constraints (such as semantic [5] or reconstruction constraints [9]) to train their models for building common representations. Since inter-media and intra-media constraints both need to be optimized as objective functions, there is a complex optimization problem limiting the performance of cross-media retrieval.…”
Section: Residual Correlation Learning (mentioning)
confidence: 99%
“…Instead of designing stacked nonlinear layers to approximate f_c(x) by a correlation constraint as [10], we design several stacked nonlinear layers to approximate the residual function Δf(x) = f_c(x) − f_s(x). The process of f_s(x) + Δf(x) is realized by a shortcut connection and an element-wise addition, so that the residual function is parameterized by residual layers.…”
Section: Fig. 1 An Overview of Our Residual Correlation Network (RCN) (mentioning)
confidence: 99%
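
The shortcut-plus-addition construction in the statement above can be sketched as follows. This is a rough illustration only: the statement does not say what the residual layers take as input or which nonlinearity they use, so feeding them f_s(x) and using tanh are assumptions, and `dense_tanh` and `residual_block` are hypothetical names.

```python
import math


def dense_tanh(x, weights, bias):
    """One fully connected layer with a tanh nonlinearity (assumed activation)."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def residual_block(f_s, layers):
    """Return f_c = f_s + delta_f, where delta_f comes from stacked layers.

    f_s    : the representation f_s(x) carried by the shortcut connection
    layers : list of (weights, bias) pairs parameterizing the residual function
    """
    delta = f_s  # assumption: the residual layers read f_s(x) as their input
    for weights, bias in layers:
        delta = dense_tanh(delta, weights, bias)
    # Shortcut connection followed by element-wise addition.
    return [s + d for s, d in zip(f_s, delta)]


# With all-zero residual layers, delta_f is zero and the block is the identity.
zero_layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
print(residual_block([0.5, -1.0], [zero_layer]))
```

The zero-weight case shows why this parameterization is convenient: the block passes f_s(x) through unchanged until the residual layers learn a non-zero correction.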