2020
DOI: 10.1109/tmm.2019.2954741

Semi-Supervised Cross-Modal Retrieval With Label Prediction

Abstract: Due to the abundance of data from multiple modalities, cross-modal retrieval tasks with image-text, audio-image, etc. are gaining increasing importance. Of the different approaches proposed, supervised methods usually give significant improvement over their unsupervised counterparts at the additional cost of labeling or annotation of the training data. Semi-supervised methods have recently become popular as they provide an elegant framework to balance the conflicting requirements of labeling cost and accuracy. In t…

Cited by 30 publications (5 citation statements)
References 31 publications

“…In [29], authors have presented a novel cross-modal retrieval with collective deep semantic learning (CR-CDSL) approach which makes use of two complementing deep neural networks and deep restricted Boltzmann machines are utilized for weight initialization in the neural networks. A deep semi-supervised cross-modal retrieval framework is proposed in [30] which can effectually tackle both labeled and unlabeled multi-modal data. A label prediction component is utilized in predicting labels for unlabeled training data and a shared representation is learned for the modalities.…”
Section: Deep Learning Based Methods (mentioning)
confidence: 99%
“…Lately, semi-supervised techniques are gaining popularity as they provide a better framework to balance the trade-off between annotation cost and retrieval accuracy. A novel deep semi-supervised framework is proposed in [87] to handle both annotated and un-annotated data. Firstly, an un-annotated part of training data is labeled using the label prediction component and then a common representation of both modalities is learned to perform cross-modal retrieval.…”
Section: Machine Learning and Deep Learning Based Methods (mentioning)
confidence: 99%
“…Lee et al [14] chose the class with the highest prediction probability of the model as the pseudo label; however, pseudo labels are only used in the fine-tuning stage, and the network needs to be pre-trained. Mandal et al [31] proposed a new deep semi-supervised framework, which can seamlessly process marked and unlabeled data. The framework is trained by two parts in turn: firstly, the label prediction component is used to predict the label of the unlabeled part of the training data, and then the common representation of two patterns is learned for cross-modal retrieval.…”
Section: Pseudo-labeling (mentioning)
confidence: 99%
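
The pseudo-labeling step described in the statement above (take the class with the highest predicted probability as the label of an unlabeled sample) can be sketched as follows. This is a minimal illustration only, not the method of Lee et al. or Mandal et al.: the function name `pseudo_label`, the probability values, and the labeled targets are all hypothetical, and a real system would obtain the probabilities from a trained label prediction network.

```python
from typing import List, Sequence


def pseudo_label(probabilities: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each sample, the index of its highest-probability class."""
    labels = []
    for row in probabilities:
        # Index of the largest entry in this sample's class distribution.
        best = max(range(len(row)), key=lambda k: row[k])
        labels.append(best)
    return labels


# Hypothetical class probabilities for three unlabeled samples (three classes).
unlabeled_probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]

# Hypothetical ground-truth labels for the annotated part of the training set.
labeled_targets = [0, 2]

predicted = pseudo_label(unlabeled_probs)

# Annotated labels and pseudo labels together form the targets that a later
# stage (for example, common representation learning) could train on.
training_targets = labeled_targets + predicted
print(training_targets)
```

In this sketch every unlabeled sample receives a pseudo label; filtering by a confidence threshold would be a separate design choice that the statements above do not describe.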