2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2016.15

Latent Embeddings for Zero-Shot Classification

Abstract: We present a novel latent embedding model for learning a compatibility function between image and class embeddings in the context of zero-shot classification. The proposed method augments the state-of-the-art bilinear compatibility model by incorporating latent variables. Instead of learning a single bilinear map, it learns a collection of maps, with the selection of which map to use being a latent variable for the current image-class pair. We train the model with a ranking-based objective function which penalizes incorrect rankings of the true class for a given image.
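
The abstract describes a compatibility function built from several bilinear maps, with a latent variable choosing which map scores a given image-class pair, trained with a ranking objective. The NumPy sketch below illustrates that idea under assumed dimensions and a hinge-style ranking loss; the variable names, the number of maps K, and the loss details are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of a latent bilinear compatibility model (assumed dimensions
# and hyperparameters; not the paper's reference code).
import numpy as np

rng = np.random.default_rng(0)

d_img, d_cls, K = 1024, 312, 6   # image dim, class-embedding dim, number of maps (assumed)
W = [rng.normal(scale=0.01, size=(d_img, d_cls)) for _ in range(K)]

def compatibility(x, y, W):
    """F(x, y) = max_i x^T W_i y; the latent variable is the index of the chosen map."""
    scores = np.array([x @ Wi @ y for Wi in W])
    i_star = int(np.argmax(scores))
    return scores[i_star], i_star

def ranking_loss(x, true_idx, class_embeddings, W, margin=1.0):
    """Hinge-style ranking objective: penalize wrong classes scored above the true one."""
    f_true, _ = compatibility(x, class_embeddings[true_idx], W)
    loss = 0.0
    for j, y in enumerate(class_embeddings):
        if j == true_idx:
            continue
        f_wrong, _ = compatibility(x, y, W)
        loss += max(0.0, margin + f_wrong - f_true)
    return loss
```

In this sketch the latent variable is resolved by the max over maps at scoring time, and the ranking loss would be minimized over the W_i with any gradient-based optimizer.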

Cited by 648 publications (550 citation statements). References 33 publications (151 reference statements).
“…However, these methods involve the data of unseen classes to learn the model, which to some extent breaches the strict ZSL setting. Recent work [4], [33] combines the embedding-inferring procedure into a unified framework and empirically demonstrates better performance. The closest related work is [34], which goes one step further to synthesise classifiers for unseen classes.…”
Section: Related Work (mentioning, confidence: 99%)
“…These results are given for the data sets CUB, SUN, AWA1 and AWA2. We compare our approach with 12 leading GZSL methods, which are divided into three groups: semantic (SJE [24], ALE [25], LATEM [26], ES-ZSL [27], SYNC [12], DEVISE [2]), latent space learning (SAE [15], f-CLSWGAN [11], cycle-WGAN [3] and CADA-VAE [4]) and domain classification (CMT [6] and DAZSL [5]). The semantic group contains methods that only use the seen class visual and semantic samples to learn a transformation function from the visual to the semantic space, and classification is based on nearest neighbour classification in that semantic space.…”
Section: Results (mentioning, confidence: 99%)
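
The excerpt above describes the "semantic" group as learning a visual-to-semantic transformation on seen classes and then classifying by nearest-neighbour search in the semantic space. A minimal sketch of that recipe follows; the ridge-regression fit and cosine similarity are assumptions chosen for illustration, not details taken from any of the cited methods.

```python
# Sketch of the visual-to-semantic ZSL recipe described in the excerpt:
# fit a linear map on seen classes, then assign each test image to the
# nearest unseen class embedding (illustrative choices throughout).
import numpy as np

def fit_visual_to_semantic(X_seen, S_seen, lam=1.0):
    """Closed-form ridge regression: M = (X^T X + lam I)^{-1} X^T S."""
    d = X_seen.shape[1]
    return np.linalg.solve(X_seen.T @ X_seen + lam * np.eye(d), X_seen.T @ S_seen)

def predict_zero_shot(X_test, M, S_unseen):
    """Project test images into semantic space; pick the closest unseen class embedding."""
    P = X_test @ M
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    C = S_unseen / np.linalg.norm(S_unseen, axis=1, keepdims=True)
    return np.argmax(P @ C.T, axis=1)   # index into the unseen classes
```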
“…In addition, non-linear compatibility mapping models have also been proposed. LATEM [6] learns a piecewise compatibility model, which yields a nonlinear compatibility function, while CMT [15] trains a neural network with two hidden layers to learn a nonlinear mapping from the image feature space to the word2vec space. DEM [5] argues that the image feature space is more discriminative than the semantic space, and therefore proposes an end-to-end deep embedding model which maps from the semantic space into the image feature space.…”
Section: Linear and Nonlinear Embedding Models (mentioning, confidence: 99%)
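
The excerpt characterizes CMT as a neural network with two hidden layers mapping image features into word2vec space. The PyTorch sketch below mirrors that description only loosely; the layer sizes, tanh activations, and MSE regression objective are assumptions for illustration rather than the cited method's exact configuration.

```python
# Hedged sketch of a two-hidden-layer nonlinear mapping from image features
# to word2vec class embeddings (assumed dimensions, activations, and loss).
import torch
import torch.nn as nn

d_img, d_w2v = 2048, 300   # assumed feature / word-embedding dimensions

mapper = nn.Sequential(
    nn.Linear(d_img, 1024), nn.Tanh(),   # hidden layer 1
    nn.Linear(1024, 512), nn.Tanh(),     # hidden layer 2
    nn.Linear(512, d_w2v),               # output in word-embedding space
)

def train_step(x, s_true, opt, loss_fn=nn.MSELoss()):
    """One regression step: push the mapped image feature towards its class's word2vec vector."""
    opt.zero_grad()
    loss = loss_fn(mapper(x), s_true)
    loss.backward()
    opt.step()
    return loss.item()
```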