Proceedings of the 19th ACM International Conference on Multimedia 2011
DOI: 10.1145/2072298.2071944
Efficient multi-modal retrieval in conceptual space

Cited by 7 publications (4 citation statements)
References 5 publications
“…Sounds emitted by people, such as clothes chafing and footsteps, are used for human identification in the camera's view. CCA also benefits cross-modal and multi-modal retrieval for video and audio [41], in which queries can be single-modal (image) or multi-modal (a combination of image, audio, and location). Perhaps the closest work to our problem is [14], proposed by Zhang et al., which investigates the cross-modal relationship between an animal's image and its corresponding sounds.…”
Section: Related Work
confidence: 99%
“…However, text-based retrieval methods only let users describe the past in concrete language. Another direction is using images or videos as user queries (Imura et al. 2011; Chandrasekhar et al. 2014), which involves a content-based image retrieval method (Smeulders et al. 2000). However, content-based image retrieval requires, as queries, images or videos containing the objects and specific locations we want to remember.…”
Section: Related Work
confidence: 99%
“…The key problem in cross-modal retrieval is how to measure similarity among different media modalities; existing methods usually learn a common space in which classical similarity measures can be applied directly. Such common spaces include correlative subspaces [6,7,8], semantic spaces [8], and hash spaces [12].…”
Section: Introduction
confidence: 99%
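The hash-space approach mentioned in the excerpt above can be illustrated with a minimal LSH-style sketch: binarize real-valued features via the sign of a shared random projection, then compare items by Hamming distance. The dimensions and data here are synthetic placeholders, not from any cited method.

```python
import numpy as np

rng = np.random.default_rng(1)
d, bits = 64, 16
W = rng.standard_normal((d, bits))  # shared random projection matrix

def to_hash(x):
    """Binary code: sign bits of a random projection of the feature vector."""
    return (x @ W > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

x = rng.standard_normal(d)
near = x + 0.01 * rng.standard_normal(d)   # near-duplicate of x
far = rng.standard_normal(d)               # unrelated vector

# A near-duplicate should differ in no more bits than an unrelated item.
print(hamming(to_hash(x), to_hash(near)), hamming(to_hash(x), to_hash(far)))
```

Because matching reduces to bit comparisons, hash spaces trade some accuracy for very fast large-scale retrieval.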
“…Rasiwasia et al. [8] apply CCA to learn the subspace that maximizes the correlation between image and text. Imura et al. [7] use GCCA to simultaneously capture the correlation among image, sound, and location information. Li et al. [6] introduce CFA to seek transformations that best represent the association between two different modalities.…”
Section: Introduction
confidence: 99%