Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1089
Gaussian Visual-Linguistic Embedding for Zero-Shot Recognition

Abstract: An exciting outcome of research at the intersection of language and vision is that of zero-shot learning (ZSL). ZSL promises to scale visual recognition by borrowing distributed semantic models (DSMs) learned from linguistic corpora and turning them into visual recognition models. However, the popular word-vector DSM embeddings are relatively impoverished in their expressivity, as they model each word as a single vector point. In this paper we explore word-distribution embeddings for ZSL. We present a visual-linguistic …
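The abstract's contrast between point embeddings and word-distribution embeddings can be made concrete. The snippet below is a minimal sketch, not the paper's actual model: it represents each word as a diagonal Gaussian and scores a pair of words with the closed-form expected-likelihood (probability product) kernel, a standard similarity for Gaussian embeddings. All names and the random initialization are illustrative assumptions.

```python
import numpy as np

# Hypothetical Gaussian word embedding: each word gets a mean vector and a
# positive diagonal variance, instead of the single point used by word vectors.
rng = np.random.default_rng(0)
DIM = 50

def random_gaussian_embedding():
    """Illustrative random initialization of (mean, diagonal variance)."""
    mu = rng.normal(scale=0.1, size=DIM)
    var = np.exp(rng.normal(scale=0.1, size=DIM))  # exp keeps variances positive
    return mu, var

def log_expected_likelihood(mu1, var1, mu2, var2):
    """Closed-form log expected-likelihood kernel between diagonal Gaussians:
    log ∫ N(x; mu1, var1) N(x; mu2, var2) dx = log N(mu1; mu2, var1 + var2)."""
    var = var1 + var2
    diff = mu1 - mu2
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var)

cat, tiger = random_gaussian_embedding(), random_gaussian_embedding()
print(log_expected_likelihood(*cat, *tiger))
```

The learned variance is the extra expressivity the abstract alludes to: a broad concept such as "animal" can acquire a larger variance than a specific one such as "tabby", something a single point vector cannot encode.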

Cited by 43 publications (30 citation statements); citing publications span 2017–2024. References 19 publications.

Citation statements (ordered by relevance):
“…Note that existing works follow two settings. Some of them [30,18] … Even fewer published results are available on this dataset (ILSVRC 2012/2010). Table 3 shows that our model clearly outperforms the state-of-the-art alternatives by a large margin.…”
Section: Experiments on ImageNet
Citation type: mentioning (confidence: 99%)
“…Beyond injective embedding: Similar to our motivation, some attempts have been made to go beyond the injective mapping. One approach is to design the embedding function to be stochastic and map an instance to a probability distribution (e.g., a Gaussian) instead of a single point [43,38,39]. However, learning distributions is typically difficult and expensive, and often leads to approximate solutions such as Monte Carlo sampling.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
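The cost this statement attributes to distribution embeddings (approximate solutions such as Monte Carlo sampling) can be illustrated with a small sketch. The toy linear encoder and the choice of cosine similarity below are assumptions, not the cited papers' models: the encoder maps an input to a Gaussian, and the expected cosine similarity between two such embeddings has no closed form, so it is estimated by sampling via the reparameterization trick.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_EMB = 10, 4

# Hypothetical stochastic embedding: a toy linear encoder maps input x to the
# mean and log-variance of a diagonal Gaussian in embedding space.
W_mu = rng.normal(size=(D_EMB, D_IN))
W_logvar = rng.normal(scale=0.1, size=(D_EMB, D_IN))

def encode(x):
    return W_mu @ x, W_logvar @ x  # (mean, log-variance)

def expected_cosine(x1, x2, n_samples=2000):
    """Monte Carlo estimate of E[cos(z1, z2)] with z = mu + sigma * eps.
    Unlike the Gaussian-product kernel, this expectation has no closed form,
    which is why such models resort to sampling-based approximations."""
    mu1, lv1 = encode(x1)
    mu2, lv2 = encode(x2)
    z1 = mu1 + np.exp(0.5 * lv1) * rng.normal(size=(n_samples, D_EMB))
    z2 = mu2 + np.exp(0.5 * lv2) * rng.normal(size=(n_samples, D_EMB))
    cos = np.sum(z1 * z2, axis=1) / (
        np.linalg.norm(z1, axis=1) * np.linalg.norm(z2, axis=1))
    return cos.mean()

x_a, x_b = rng.normal(size=D_IN), rng.normal(size=D_IN)
print(expected_cosine(x_a, x_b))
```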
“…[54] proposes to combine semantic embeddings and a knowledge graph with a graph convolutional network [24]. An orthogonal direction is generative models [52,38], where a class-conditional distribution is learned under a Gaussian assumption.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
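The generative direction mentioned here (a class-conditional distribution learned under a Gaussian assumption) reduces, in its simplest form, to one Gaussian per class over visual features with maximum-likelihood classification. The sketch below is an illustrative reduction with made-up class names and parameters, not the method of [52] or [38]; in an actual ZSL model, an unseen class's Gaussian parameters would be predicted from its semantic embedding rather than posited directly.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 5  # visual feature dimension

# Hypothetical class-conditional Gaussians N(mu_c, diag(var_c)) over visual
# features. For unseen classes, (mu_c, var_c) would come from the class's
# semantic embedding; here they are simply posited for illustration.
class_params = {
    "zebra": (rng.normal(size=D), np.full(D, 0.5)),  # (mean, diagonal variance)
    "okapi": (rng.normal(size=D), np.full(D, 0.8)),
    "tapir": (rng.normal(size=D), np.full(D, 0.6)),
}

def log_likelihood(x, mu, var):
    """Diagonal-Gaussian log density log N(x; mu, diag(var))."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def classify(x):
    """Pick the class whose Gaussian assigns x the highest likelihood
    (Bayes rule with a uniform class prior)."""
    return max(class_params, key=lambda c: log_likelihood(x, *class_params[c]))

# Draw a test feature from the 'okapi' Gaussian; it should classify correctly.
mu, var = class_params["okapi"]
print(classify(mu + np.sqrt(var) * rng.normal(size=D)))
```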