Ridge Regression, Hubness, and Zero-Shot Learning

Shigeto, Yutaro; Suzuki, Ikumi; Hara, Kazuo; Shimbo, Masashi; Matsumoto, Yūji

doi:10.1007/978-3-319-23528-8_9

Cited by 248 publications

(203 citation statements)

References 15 publications

Supporting

Mentioning: 201

Contrasting


“…Linear S → V is based on [22] where the authors argue that using the semantic space as the embedding space reduces the variance of the projected points and thus aggravates the hubness problem [23]. They suggest instead to project semantic class prototypes onto the visual space and to compute similarities in this space.…”

Section: Experimental Evaluation 41 Methods (mentioning)

confidence: 99%
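The semantic-to-visual projection described in the statement above can be written as a ridge regression. The following is a sketch under assumed notation, not taken from this page: X is the n × d matrix of visual features, Y is the n × c matrix holding the semantic vector of each training example's class, M is the c × d projection, λ > 0 is the regularization weight, and y_k is the semantic prototype of candidate class k.

```latex
% Ridge regression from the semantic space to the visual space
% (the symbols X, Y, M, \lambda, y_k are illustrative assumptions).
\begin{align}
  \hat{M} &= \arg\min_{M} \; \lVert X - Y M \rVert_F^2 + \lambda \lVert M \rVert_F^2 \\
          &= \left( Y^{\top} Y + \lambda I \right)^{-1} Y^{\top} X \\
  \hat{k}(x) &= \arg\min_{k} \; \lVert x - y_k \hat{M} \rVert_2
\end{align}
```

The last line is the nearest-neighbor search carried out in the visual space: a test feature x is assigned the class whose projected prototype lies closest to it.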

From Classical to Generalized Zero-Shot Learning: A Simple Adaptation Process

Cacheux, Borgne, Crucianu (2018). MultiMedia Modeling.

Zero-shot learning (ZSL) is concerned with the recognition of previously unseen classes. It relies on additional semantic knowledge for which a mapping can be learned with training examples of seen classes. While classical ZSL considers the recognition performance on unseen classes only, generalized zero-shot learning (GZSL) aims at maximizing performance on both seen and unseen classes. In this paper, we propose a new process for training and evaluation in the GZSL setting; this process addresses the gap in performance between samples from unseen and seen classes by penalizing the latter, and enables the selection of hyper-parameters well suited to the GZSL task. It can be applied to any existing ZSL approach and leads to a significant performance boost: the experimental evaluation shows that GZSL performance, averaged over eight state-of-the-art methods, is improved from 28.5 to 42.2 on CUB and from 28.2 to 57.1 on AwA2.
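The abstract does not spell out how seen classes are penalized. One simple way to realise such a penalty, shown here only as an illustrative sketch, is to subtract a constant from the compatibility score of every seen class before taking the arg max. The function name, the `gamma` parameter, and the toy scores are assumptions made for this example and are not taken from the paper.

```python
def predict_with_seen_penalty(scores, seen_classes, gamma):
    """Return the best-scoring class after penalizing seen classes.

    scores: dict mapping class name -> compatibility score (higher is better).
    seen_classes: set of class names that had training examples.
    gamma: non-negative penalty subtracted from seen-class scores (assumed form).
    """
    adjusted = {
        name: (score - gamma if name in seen_classes else score)
        for name, score in scores.items()
    }
    return max(adjusted, key=adjusted.get)


# Toy scores for one test sample (made-up values for illustration).
scores = {"horse": 0.62, "tiger": 0.55, "zebra": 0.58}
seen = {"horse", "tiger"}

no_penalty = predict_with_seen_penalty(scores, seen, gamma=0.0)
with_penalty = predict_with_seen_penalty(scores, seen, gamma=0.1)
print(no_penalty, with_penalty)
```

Without a penalty the seen class "horse" wins. With `gamma=0.1` its score drops to 0.52, below the unseen class "zebra" at 0.58. In practice `gamma` would be a hyper-parameter chosen on validation data.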


“…At the zero-shot classification stage, unseen samples are projected into the semantic space and labeled by semantic attributes [5,15,16,29]. Instead of learning a visual-semantic embedding, some previous works also propose to learn a semantic-visual mapping so that the unseen samples can be represented by the seen ones [12,30]. In addition, there are also some works to learn an intermediate space shared by the visual features and semantic features [4,38,39].…”

Section: Zero-shot Learning (mentioning)

confidence: 99%

Leveraging the Invariant Side of Generative Zero-Shot Learning

Jing, Lü, et al. (2019). 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Conventional zero-shot learning (ZSL) methods generally learn an embedding, e.g., visual-semantic mapping, to handle the unseen visual samples via an indirect manner. In this paper, we take advantage of generative adversarial networks (GANs) and propose a novel method, named leveraging invariant side GAN (LisGAN), which can directly generate the unseen features from random noises which are conditioned by the semantic descriptions. Specifically, we train a conditional Wasserstein GAN in which the generator synthesizes fake unseen features from noises and the discriminator distinguishes the fake from real via a minimax game. Considering that one semantic description can correspond to various synthesized visual samples, and the semantic description, figuratively, is the soul of the generated features, we introduce soul samples as the invariant side of generative zero-shot learning in this paper. A soul sample is the meta-representation of one class. It visualizes the most semantically-meaningful aspects of each sample in the same category. We regularize that each generated sample (the varying side of generative ZSL) should be close to at least one soul sample (the invariant side) which has the same class label as it. At the zero-shot recognition stage, we propose to use two classifiers, which are deployed in a cascade way, to achieve a coarse-to-fine result. Experiments on five popular benchmarks verify that our proposed approach can outperform state-of-the-art methods with significant improvements.


“…The Hubness problem is defined as a few points being the nearest neighbors of most of the other points, which is caused by that projecting a visual feature with high dimensions into an attributes space with low dimensions shrinks the variance of the projected data points [49]. Therefore, a few methods [35,44,49] use an embedding space spanned by visual features, which is defined as a semantic-visual embedding. Although the previous methods are effective, insufficient semantic embedding limits their further applications due to serious domain shift problems.…”

Section: Embedding-based Zero-shot Learning (mentioning)

confidence: 99%
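The hubness definition quoted above (a few points being the nearest neighbors of most other points) can be checked numerically with the k-occurrence count N_k, a common diagnostic in the hubness literature: for each candidate point, count how many queries include it among their k nearest neighbors. The sketch below uses made-up 2-D points and a hypothetical helper name, purely for illustration.

```python
from collections import Counter


def k_occurrence(queries, targets, k):
    """Return N_k for each target: how often it is among a query's k nearest targets."""
    counts = Counter()
    for q in queries:
        # Squared Euclidean distance from this query to every target.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, t)), j)
            for j, t in enumerate(targets)
        )
        for _, j in dists[:k]:
            counts[j] += 1
    return [counts[j] for j in range(len(targets))]


# Toy data: target 0 sits near the centre of the queries and becomes a hub.
targets = [(0.0, 0.0), (5.0, 0.0), (-5.0, 0.0), (0.0, 5.0)]
queries = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (4.0, 0.0)]

n_1 = k_occurrence(queries, targets, k=1)
print(n_1)  # [4, 1, 0, 0]
```

A strongly skewed N_k distribution, as in this toy case where one target collects four of the five nearest-neighbor slots, is the signature of hubness.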

“…In most existing methods [4,35,49], ϕ is trained on S and directly adapted to T and f (·) is fixed by using pre-trained visual feature extractor.…”

Section: Problem Formulation (mentioning)

confidence: 99%

Domain-Specific Embedding Network for Zero-Shot Recognition

Min, Yao, Xie, et al. (2019). Proceedings of the 27th ACM International Conference on Multimedia.

Zero-Shot Learning (ZSL) seeks to recognize a sample from either seen or unseen domain by projecting the image data and semantic labels into a joint embedding space. However, most existing methods directly adapt a well-trained projection from one domain to another, thereby ignoring the serious bias problem caused by domain differences. To address this issue, we propose a novel Domain-Specific Embedding Network (DSEN) that can apply specific projections to different domains for unbiased embedding, as well as several domain constraints. In contrast to previous methods, the DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between two domains. To prevent the two specific projections from breaking the semantic relationship, a semantic reconstruction constraint is proposed by applying the same decoder function to them in a cycle consistency way. Furthermore, a domain division constraint is developed to directly penalize the margin between real and pseudo image features in respective seen and unseen domains, which can enlarge the inter-domain difference of visual features. Extensive experiments on four public benchmarks demonstrate the effectiveness of DSEN with an average of 9.2% improvement in terms of harmonic mean. The code is available at https://github.com/mboboGO/DSEN-for-GZSL.
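The harmonic mean mentioned in the abstract is the usual GZSL summary of seen-class and unseen-class accuracy. A minimal sketch follows; the function name and the example accuracies are illustrative assumptions, not results from the paper.

```python
def harmonic_mean(acc_seen, acc_unseen):
    """Harmonic mean H = 2 * s * u / (s + u) of seen and unseen accuracies."""
    total = acc_seen + acc_unseen
    if total == 0:
        return 0.0  # both accuracies are zero; avoid dividing by zero
    return 2 * acc_seen * acc_unseen / total


# Made-up accuracies (in percent) for illustration only.
h = harmonic_mean(80.0, 20.0)
print(h)  # 32.0
```

The arithmetic mean of these two values would be 50.0, so the harmonic mean penalizes a model that does well on seen classes but poorly on unseen ones.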
