2019
DOI: 10.1109/access.2019.2940055

An Efficient Approach for Geo-Multimedia Cross-Modal Retrieval

Abstract: Due to the rapid development of mobile Internet techniques and cloud computing, and the popularity of online social networking and location-based services, a massive amount of multimedia data with geographical information is generated and uploaded to the Internet. In this paper, we propose a novel type of cross-modal multimedia retrieval called geo-multimedia cross-modal retrieval, which aims to retrieve a set of geo-multimedia objects based on geographical distance proximity and semantic similarity between different…

Citations: cited by 9 publications (4 citation statements)
References: 87 publications
“…An overview of our method MMACMR is depicted in Figure 2. Following prevailing solutions [29,49], the backbone of MMACMR comprises an image encoder f_v(·; θ_v) and a recipe encoder f_r(·; θ_r) which project food images and recipes into a common feature subspace. In this subspace, the cross-modal features can be aligned effectively so that the similarity between images and recipes can be measured with accuracy.…”
Section: Framework Overview (mentioning)
confidence: 99%
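
The statement above describes two encoders that map images and recipes into one shared space where similarity can be measured. A minimal sketch of the retrieval step follows. It assumes the embeddings have already been computed and that cosine similarity is the measure; the quoted text names neither, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def rank_recipes(image_embedding, recipe_embeddings):
    """Return recipe indices ordered from most to least similar.

    Both arguments are assumed to live in the same feature subspace,
    i.e. they are outputs of the (not shown) image and recipe encoders.
    """
    scores = [cosine_similarity(image_embedding, r) for r in recipe_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Because both modalities share one space, a single similarity function serves image-to-recipe and recipe-to-image queries alike.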
“…In line with prior research [49], we use food images with a depth of three channels in the RGB color space. All the images in our experiments are resized to 256 pixels in their shorter dimension and then cropped to 224 × 224 pixels.…”
Section: Implementation Details (mentioning)
confidence: 99%
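
The preprocessing in the statement above (shorter side resized to 256 pixels, then a 224 × 224 crop) can be worked through as plain arithmetic. The sketch below assumes the aspect ratio is preserved and the crop is centered; the quoted text does not say where the crop is taken, and the helper names are hypothetical.

```python
def resize_shorter_side(width, height, target=256):
    """Return (new_width, new_height) with the shorter side equal to target.

    The aspect ratio is preserved; the longer side is rounded to the
    nearest pixel.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = target / min(width, height)
    return max(target, round(width * scale)), max(target, round(height * scale))


def center_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a centered size x size crop.

    Centering is an assumption of this sketch, not stated in the source.
    """
    if width < size or height < size:
        raise ValueError("image is smaller than the crop size")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size
```

For a 640 × 480 input, the resize step gives 341 × 256 and the centered crop box is (58, 16, 282, 240).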
“…Cross-modal retrieval is a significant problem in the area of multimedia computing [19][20][21][22][23][24][25][26], which aims to find out the similar enough objects of one modality in the multimedia database by a query of different modality. Due to the exponential growth of amount of multimedia data, this task attracts a large number of attentions in recent years.…”
Section: Cross-modal Retrieval (mentioning)
confidence: 99%
“…The other challenge is inefficient index and retrieval algorithm in massive geo-multimedia database. To overcome this problem, in Reference [41], a novel hybrid index called GMR-Tree is developed that is an extension of R-Tree by integrating cross-modal representation. However, this work ignores the cross-modal hashing representation that can enhance the search efficiency significantly.…”
Section: Introduction (mentioning)
confidence: 99%
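
The statement above claims that a hashing representation can speed up search. One common reason is that binary codes can be compared with an XOR and a bit count instead of floating-point vector arithmetic. The sketch below illustrates only that comparison step; it assumes codes are stored as Python integers, uses made-up codes, and does not reproduce GMR-Tree or any hashing method from the cited works.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def hamming_search(query_code, database_codes, k=3):
    """Return indices of the k database codes closest to the query.

    Ties keep their original database order because sorted() is stable.
    """
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:k]
```

This linear scan is for illustration only; the efficiency gain the statement refers to would come from combining such compact codes with an index structure.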