Proceedings of the ACM International Conference on Image and Video Retrieval 2010
DOI: 10.1145/1816041.1816091

Multi modal semantic indexing for image retrieval

Abstract: Popular image retrieval schemes generally rely on a single mode (either low-level visual features or embedded text) for searching multimedia databases. Many popular image collections (e.g., those emerging on the Internet) have associated tags, often for human consumption. A natural extension is to combine information from multiple modes to enhance retrieval effectiveness. In this paper, we propose two techniques: Multi-modal Latent Semantic Indexing (MMLSI) and Multi-Modal Probabilistic Latent Seman…

Citations: cited by 30 publications (22 citation statements)
References: 31 publications
“…Specifically, we describe two direct multi-modal approaches, Multi-modal Latent Semantic Indexing (MMLSI), and Multi-modal Probabilistic Latent Semantic Analysis (MM pLSA). These recent methods [3] extend the traditional semantic analysis schemes with the help of a tensorial representation. In MMLSI the data is represented by a 3-order tensor where the first dimension is text words, second is visual words and the third is the images.…”
Section: Direct Multimodal Semantic Indexing (citation type: mentioning)
confidence: 99%
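
The tensor layout described in the statement above (text words × visual words × images) can be illustrated with a minimal sketch. The listing does not say how tensor entries are computed or how the tensor is decomposed, so two assumptions are made here purely for illustration: each image slice is the outer product of that image's text-word and visual-word count vectors, and image embeddings come from a truncated SVD of the image-mode unfolding. The names `build_tensor`, `unfold`, and `image_embeddings` are hypothetical, not taken from the paper.

```python
import numpy as np


def build_tensor(text_counts, visual_counts):
    """Build a 3-order tensor of shape (text words, visual words, images).

    ASSUMPTION (not from the source): the slice for image i is the outer
    product of its text-word counts and its visual-word counts.
    """
    # T[t, v, i] = text_counts[i, t] * visual_counts[i, v]
    return np.einsum("it,iv->tvi", text_counts, visual_counts)


def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def image_embeddings(tensor, rank):
    """Low-rank image representation from the image-mode (axis 2) unfolding.

    ASSUMPTION: a truncated SVD of one unfolding stands in for whatever
    tensor decomposition the original method uses.
    """
    u, s, _ = np.linalg.svd(unfold(tensor, 2), full_matrices=False)
    return u[:, :rank] * s[:rank]


# Toy data: 3 images, 3 text words, 4 visual words (made-up counts).
text_counts = np.array([[2, 0, 1],
                        [0, 3, 0],
                        [2, 0, 1]], dtype=float)
visual_counts = np.array([[1, 0, 0, 2],
                          [0, 4, 1, 0],
                          [1, 0, 0, 2]], dtype=float)

tensor = build_tensor(text_counts, visual_counts)
emb = image_embeddings(tensor, rank=2)
```

In this toy data, images 0 and 2 have identical counts in both modes, so they receive the same embedding, while image 1 shares no words with them and lands elsewhere in the latent space.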
“…Single mode ('visual' as well as 'tag') methods are compared against multimodal semantic indexing scheme(concat [13] as proposed in [14]). The tensorial methods proposed in [3] is superior to the single mode counterparts as well as other possible multimodal semantic indexing schemes.…”
Section: Input (citation type: mentioning)
confidence: 99%
“…Existing studies on multimodal image indexing and retrieval typically focus on techniques that can either identify a latent feature space for the image representations by fusing the multimodal feature representations, such as the Latent Semantic Indexing (LSI) [2,3], probabilistic Latent Semantic Analysis (pLSA) [10,3], and Non-negative Matrix Factorization (NMF) [1], or infer the associations among the multimodal features in order to generate a new representation for each image [7,20,9]. However, several limitations of such approaches have been identified.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Barnard et al [4] propose a translation model and a hierarchical model to represent the relationship between text and content. Several studies have attempted to use LSA technique for combining visual and textual features, including [12], [20] [ [15] and [6] who apply Probabilistic Latent Semantic Analysis for automatic image annotation or image retrieval. In the transformation model [16], text query is converted automatically into visual representations for image retrieval.…”
Section: Introduction (citation type: mentioning)
confidence: 99%