Modality-Dependent Cross-Media Retrieval

Wei, Yunchao; Zhao, Yao; Zhu, Zhongkui; Wei, Shikui; Xiao, Yanhui; Feng, Jiashi; Yan, Shuicheng

doi:10.1145/2775109

Cited by 76 publications

(35 citation statements)

References 28 publications

“…On NUS-WIDE, the MAP value of our method is far higher compared with the previous and well-known methods. It is about 14.8%, 14.6%, 14.8% higher than that of MDCR for text query image task [19], image query text task and average scores. The reason is that the integration of the predictive labels for the testing images and testing texts can promote each other.…”

Section: Results On the Wikipedia Dataset And Nus-wide Dataset (mentioning)

confidence: 81%

Two-Stage Semantic Matching for Cross-Media Retrieval

Xu

2018

IJPE

With the development of information technology, there exists a large amount of multi-media data in our lives; the data is heterogeneous with low-level features while consistent with semantic information. Traditional mono-media retrieval can't cross the heterogeneous gap of multi-media data, and cross-media retrieval is arousing many researchers' interests. In this paper, we propose a two-stage semantic matching for cross-media retrieval based on support vector machines (called TSMCR). Our approach uses a combination of testing images' predictive labels and testing texts' predictive labels as the next training labels. It makes full use of semantic information of both training samples and testing samples, and the experimental results on four state-of-the-art datasets show that the TSMCR algorithm is effective.

“…MDCR (Modality-dependent cross-media retrieval) [19]: It belongs to Task-specific Cross-modal Retrieval (TSCR). In other words, it uses different mapping mechanisms for different cross-media retrieval tasks.…”

Section: Compared Methods and Evaluation Metric (mentioning)

confidence: 99%

Two-Stage Semantic Matching for Cross-Media Retrieval

Xu

2018

IJPE

“…The second is using the same projections for all retrieval tasks (such as image→text and text→image retrieval). Instead, Wei et al [87] propose to learn different projection matrices for image→text retrieval and text→image retrieval. The idea of training different models for different tasks is also presented in the works such as [20].…”

Section: H Other Methods (mentioning)

confidence: 99%

An Overview of Cross-Media Retrieval: Concepts, Methodologies, Benchmarks, and Challenges

Peng

Huang

Zhao

2018

IEEE Trans. Circuits Syst. Video Technol.

280

Multimedia retrieval plays an indispensable role in big data utilization. Past efforts mainly focused on single-media retrieval. However, the requirements of users are highly flexible, such as retrieving the relevant audio clips with one query of image. So challenges stemming from the "media gap", which means that representations of different media types are inconsistent, have attracted increasing attention. Cross-media retrieval is designed for the scenarios where the queries and retrieval results are of different media types. As a relatively new research topic, its concepts, methodologies and benchmarks are still not clear in the literatures. To address these issues, we review more than 100 references, give an overview including the concepts, methodologies, major challenges and open issues, as well as build up the benchmarks including datasets and experimental results. Researchers can directly adopt the benchmarks to promptly evaluate their proposed methods. This will help them to focus on algorithm design, rather than the time-consuming compared methods and results. It is noted that we have constructed a new dataset XMedia, which is the first publicly available dataset with up to five media types (text, image, video, audio and 3D model). We believe this overview will attract more researchers to focus on cross-media retrieval and be helpful to them.
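
The citation statements above describe MDCR as using different projection matrices for image→text and text→image retrieval. The following is a minimal sketch of what that means at query time only. It is not the method from Wei et al.: how the matrices are learned is not shown, and the names `project`, `cosine`, `retrieve`, the task keys `"I2T"` and `"T2I"`, and the cosine ranking are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]  # rows = input dims, columns = common-space dims


def project(x: Vector, W: Matrix) -> List[float]:
    """Return x^T W, mapping a feature vector into the common space."""
    return [sum(xi * row[j] for xi, row in zip(x, W)) for j in range(len(W[0]))]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def retrieve(
    task: str,
    query: Vector,
    gallery: Sequence[Vector],
    projections: Dict[str, Tuple[Matrix, Matrix]],
) -> List[int]:
    """Rank gallery items for one task using that task's own (image, text) matrices.

    projections[task] holds a hypothetical pair (W_img, W_txt) learned for that task.
    """
    W_img, W_txt = projections[task]
    if task == "I2T":        # image query, text gallery
        W_query, W_gallery = W_img, W_txt
    elif task == "T2I":      # text query, image gallery
        W_query, W_gallery = W_txt, W_img
    else:
        raise ValueError(f"unknown task: {task}")
    q = project(query, W_query)
    scores = [cosine(q, project(g, W_gallery)) for g in gallery]
    # Indices of gallery items, most similar first.
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)
```

A method that shares one projection pair across both tasks would correspond to `projections["I2T"]` and `projections["T2I"]` holding the same matrices; the task-specific setting lets them differ.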

“…In a common semantic subspace, data with the same semantics are similar to each other through potential relationships. Wei et al proposed a modality-dependent cross-media retrieval method [40]. The method focuses on the retrieval direction and uses the semantic information of the query modality to project the data into the semantic space of the query modality.…”

Section: Related Work (mentioning)

confidence: 99%

“…Otherwise, rel_k = 0; R_k is the number of related items in the top k returns. To evaluate the performance of the proposed GRMD retrieval method, we compare GRMD with the canonical correlation analysis (CCA) [22], kernel canonical correlation analysis (KCCA) [19], semantic matching (SM) [22], semantic correlation matching (SCM) [22], three-view canonical correlation analysis (T-V CCA) [42], generalized multiview linear discriminant analysis (GMLDA) [29], generalized multiview canonical correlation analysis (GMMFA) [29], modality-dependent cross-media retrieval (MDCR) [40], joint feature selection and subspace learning (JFSSL) [43], joint latent subspace learning and regression (JLSLR) [44], generalized semisupervised structured subspace learning (GSSSL) [45], a cross-media retrieval algorithm based on the consistency of collaborative representation (CRCMR) [46], cross-media retrieval based on linear discriminant analysis (CRLDA) [47], and cross-modal online low-rank similarity (CMOLRS) function learning method [48]. The descriptions and characteristics of the above comparison methods used in the whole experiment are summarized in Table 2.…”

Section: Experimental Settings (mentioning)

confidence: 99%

Modality-Dependent Cross-Modal Retrieval Based on Graph Regularization

Wang

Kong

et al. 2020

Mobile Information Systems

Nowadays, the heterogeneity gap of different modalities is the key problem for cross-modal retrieval. In order to overcome heterogeneity gaps, potential correlations of different modalities need to be mined. At the same time, the semantic information of class labels is used to reduce the semantic gaps between different modalities data and realize the interdependence and interoperability of heterogeneous data. In order to fully exploit the potential correlation of different modalities, we propose a cross-modal retrieval framework based on graph regularization and modality dependence (GRMD). Firstly, considering the potential feature correlation and semantic correlation, different projection matrices are learned for different retrieval tasks, such as image query text (I2T) or text query image (T2I). Secondly, utilizing the internal structure of original feature space constructs an adjacent graph with semantic information constraints which can make different labels of heterogeneous data closer to the corresponding semantic information. The experimental results on three widely used datasets demonstrate the effectiveness of our method.
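
The Experimental Settings excerpt above is truncated, but its symbols (rel_k as a relevance indicator at rank k, R_k as the number of related items in the top k returns) match the usual average precision measure behind the MAP scores quoted on this page. The sketch below assumes that standard definition, AP = (1/R) Σ_k (R_k / k) · rel_k, with R taken as the number of relevant items in the evaluated list; the function names and the `top_k` parameter are illustrative and not taken from the cited paper.

```python
from typing import Iterable, Optional, Sequence


def average_precision(rel: Sequence[int], top_k: Optional[int] = None) -> float:
    """Average precision of one ranked list.

    rel[k-1] is 1 if the k-th returned item is relevant to the query, else 0.
    If top_k is given, only the first top_k results are evaluated.
    """
    ranked = list(rel if top_k is None else rel[:top_k])
    total_relevant = sum(ranked)   # assumption: R counts relevant items in the evaluated list
    if total_relevant == 0:
        return 0.0
    hits = 0                       # R_k: relevant items seen in the top k so far
    score = 0.0
    for k, rel_k in enumerate(ranked, start=1):
        if rel_k:
            hits += 1
            score += hits / k      # precision at rank k, counted only at relevant ranks
    return score / total_relevant


def mean_average_precision(runs: Iterable[Sequence[int]]) -> float:
    """MAP: the mean of per-query average precision; 0.0 for an empty set of queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r) for r in runs) / len(runs)
```

For example, a ranked list with relevance flags `[1, 0, 1, 0]` has precision 1/1 at rank 1 and 2/3 at rank 3, so its AP is (1 + 2/3) / 2.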
