2016
DOI: 10.1109/TCSVT.2015.2400779

Semi-Supervised Cross-Media Feature Learning With Unified Patch Graph Regularization

Abstract: With the rapid growth of multimedia data such as text, image, video, audio, and 3D models, cross-media retrieval has become increasingly important: users can retrieve results of various media types by submitting a query of any single media type. Compared with single-media retrieval such as image retrieval or text retrieval, cross-media retrieval is more useful because it returns results of all media types at the same time. In this paper, we focus on how to learn cross-media features fo…
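
The retrieval setting the abstract describes, querying with one media type and ranking results of another through a learned common space, can be illustrated with a minimal sketch. Everything below (the feature dimensions, the projection matrices P_img and P_txt, the cosine ranking) is a hypothetical placeholder standing in for what a feature-learning method such as the paper's would produce; the truncated abstract does not specify the actual algorithm:

```python
# Illustrative sketch of cross-media retrieval in a learned common space.
# The projection matrices here are random placeholders, not learned ones.
import numpy as np

rng = np.random.default_rng(0)

d_img, d_txt, d_common = 128, 300, 64            # hypothetical dimensions
P_img = rng.standard_normal((d_img, d_common))   # placeholder projection (image)
P_txt = rng.standard_normal((d_txt, d_common))   # placeholder projection (text)

def to_common(x, P):
    """Project a feature vector into the common space and L2-normalize it."""
    z = x @ P
    return z / (np.linalg.norm(z) + 1e-12)

query_img = rng.standard_normal(d_img)           # an image query
text_db = rng.standard_normal((1000, d_txt))     # a database of text features

q = to_common(query_img, P_img)
db = np.stack([to_common(t, P_txt) for t in text_db])
print(np.argsort(-db @ q)[:5])                   # indices of the top-5 texts
```

Once both modalities live in the same normalized space, any ranking by inner product works in either direction (image-to-text or text-to-image), which is what makes a single query return results of all media types.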

Cited by 133 publications (65 citation statements) | References 33 publications

Citation statements (ordered by relevance):
“…Furthermore, Joint Representation Learning (JRL) [9] is proposed to construct several separate graphs for different modalities and learn projection matrices with the joint consideration of correlation and semantic information. Peng et al. [24] further improve the previous works [9], [23] by constructing a unified hypergraph to learn the common space for up to five modalities, which also exploits fine-grained information. Besides, Wang et al. [10] adopt a multimodal graph regularization term to preserve inter-modality and intra-modality similarity relationships.…”
Section: A. Common Space Learning for Cross-Modal Retrieval
confidence: 99%
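
The graph-regularization idea running through these methods (separate per-modality graphs in JRL, a unified hypergraph in Peng et al. [24], multimodal graph terms in Wang et al. [10]) reduces, in its simplest form, to a Laplacian smoothness penalty on the common-space embeddings. The sketch below is a simplified single-graph version under assumptions of my own (a 0/1 k-NN affinity and the unnormalized Laplacian), not the exact construction of any cited paper:

```python
# Minimal single-graph regularizer: tr(Z^T L Z) with L = D - W, which equals
# (1/2) * sum_ij W_ij * ||z_i - z_j||^2, so graph neighbors are pulled close.
import numpy as np

def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbor affinity matrix over rows of X."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)                 # exclude self-edges
    W = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :k]          # k nearest neighbors per row
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, idx.ravel()] = 1.0
    return np.maximum(W, W.T)                    # symmetrize

def graph_regularizer(Z, W):
    """Smoothness of embeddings Z on the graph W (unnormalized Laplacian)."""
    L = np.diag(W.sum(axis=1)) - W
    return np.trace(Z.T @ L @ Z)

rng = np.random.default_rng(1)
Z = rng.standard_normal((50, 16))                # embeddings of 50 samples
W = knn_affinity(Z, k=5)
print(graph_regularizer(Z, W))                   # smaller = smoother on graph
```

When edges connect samples both within a modality and across modalities, minimizing this single term preserves intra-modality and inter-modality similarity at once, which is the role the multimodal regularizers above play.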
“…Semi-supervised methods [Zhai et al., 2014; Peng et al., 2016; Zhang et al., 2018] try to bridge the gap between the supervised and unsupervised methods by utilizing both labeled and unlabeled data for training; this direction is much less explored than the others. Zhai et al. [2014] establish a unified optimization framework to jointly model the correlation and the semantic information of the training examples.…”
Section: Related Work
confidence: 99%
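
The semi-supervised recipe these works share, fitting the labeled samples while letting unlabeled ones influence the solution through a graph, can be written as one objective. The sketch below is a generic illustration under my own assumptions (a linear projection P, a squared-error label term, and a Laplacian smoothness term weighted by lam), not the specific formulation of Zhai et al. [2014] or Peng et al. [2016]:

```python
# Generic semi-supervised objective (illustrative assumptions, not the cited
# papers' exact formulations): fit labeled samples in the common space while
# a graph-smoothness term lets unlabeled samples shape the projection too.
import numpy as np

def semi_supervised_loss(P, X, Y_lab, lab_idx, W, lam=0.1):
    """||X[lab] @ P - Y_lab||_F^2 + lam * tr((X @ P)^T L (X @ P))."""
    Z = X @ P                                    # embed labeled + unlabeled
    sup = np.sum((Z[lab_idx] - Y_lab) ** 2)      # supervised fitting term
    L = np.diag(W.sum(axis=1)) - W               # unnormalized Laplacian
    return sup + lam * np.trace(Z.T @ L @ Z)     # smoothness over all samples

rng = np.random.default_rng(2)
X = rng.standard_normal((60, 32))                # 60 samples, only 10 labeled
W = (rng.random((60, 60)) < 0.1).astype(float)   # toy affinity graph
W = np.maximum(W, W.T); np.fill_diagonal(W, 0.0)
lab_idx = np.arange(10)
Y_lab = rng.standard_normal((10, 8))             # target label embeddings
P = rng.standard_normal((32, 8))
print(semi_supervised_loss(P, X, Y_lab, lab_idx, W))
```

The key point is that the Laplacian term is summed over all 60 samples while the fitting term only touches the 10 labeled ones, so the unlabeled data still constrain P, which is exactly how these methods bridge the supervised and unsupervised settings.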
“…To overcome this limitation, researchers have also attempted to develop datasets and methods for scenarios with more media types. For example, the newly constructed XMedia dataset (http://www.icst.pku.edu.cn/mipl/XMedia) is the first dataset containing five media types (text, image, video, audio, and 3D model), and methods such as those proposed by Zhai et al. (2014) and Peng et al. (2016b) can jointly model the correlations and semantic information in a unified framework with graph regularization for the five media types on the XMedia dataset. Yang et al. (2008) introduced another model called the multimedia document (MMD) to represent data, where each MMD is a set of media objects of different modalities but carrying the same semantics.…”
Section: Theory and Model for Cross-Media Uniform Representation
confidence: 99%
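
The multimedia document (MMD) of Yang et al. (2008), as the excerpt describes it, is essentially a grouping data structure: one semantic unit holding media objects of several modalities. A hypothetical sketch of that notion (the class and field names are mine, not from the original paper):

```python
# Hypothetical sketch of a multimedia document (MMD): one semantic unit
# grouping feature vectors from different modalities. Names are illustrative.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class MultimediaDocument:
    semantics: str                                   # shared meaning / label
    modalities: dict = field(default_factory=dict)   # modality name -> features

doc = MultimediaDocument("dog")
doc.modalities["image"] = np.zeros(128)              # placeholder features
doc.modalities["text"] = np.zeros(300)
```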
“…As discussed in Pan (2016), cross-media intelligence plays the role of a cornerstone in artificial intelligence, through which machines can recognize the external environment. Although considerable improvement has been made in the research of cross-media analysis and reasoning (Rasiwasia et al., 2010; Yang et al., 2012; Peng et al., 2016a; 2016b), there remain some important challenges and unclear points in future research directions. In this paper, we give a comprehensive overview of not only the advances achieved by existing studies, but also future directions for cross-media analysis and reasoning.…”
Section: Introduction
confidence: 99%