Adaptive Confidence Smoothing for Generalized Zero-Shot Learning

Atzmon, Yuval; Chechik, Gal

doi:10.1109/cvpr.2019.01194

Cited by 105 publications

(95 citation statements)

References 41 publications


“…To this end, our motivations and formulations focus on the GZSL settings. Previous methods which focus on GZSL, e.g., CADA-VAE [37], PREN [43], COSMO [4] and GDAN [19], did not report the performance on ZSL evaluations. Therefore, we do not conduct extensive evaluations on ZSL but only report the performance on the most challenging dataset used in this paper, i.e., SUN, to show that our model is also effective for ZSL.…”

Section: Results of ZSL (mentioning)

confidence: 85%

“…To verify our proposed method, we compare it with both embedding methods: DeViSE [15], ESZSL [36], ALE [1], SAE [24], SJE [2], DEM [44]; and generative methods: f-CLSWGAN [41], GAZSL [45], cyc-CLSWGAN [14], SE [39], LisGAN [28], CADA-VAE [37], PREN [43] and COSMO [4]. The results of the compared methods are cited from the original papers and the recent survey paper [42].…”

Section: AWA1 (mentioning)

confidence: 99%


Learning Modality-Invariant Latent Representations for Generalized Zero-shot Learning

Jing, Zhu, et al. 2020

Proceedings of the 28th ACM International Conference on Multimedia


Recently, feature generating methods have been successfully applied to zero-shot learning (ZSL). However, most previous approaches only generate visual representations for zero-shot recognition. In fact, typical ZSL is a classic multi-modal learning protocol which consists of a visual space and a semantic space. In this paper, therefore, we present a new method which can simultaneously generate both visual representations and semantic representations so that the essential multi-modal information associated with unseen classes can be captured. Specifically, we address the most challenging issue in such a paradigm, i.e., how to handle the domain shift and thus guarantee that the learned representations are modality-invariant. To this end, we propose two strategies: 1) leveraging the mutual information between the latent visual representations and the semantic representations; 2) maximizing the entropy of the joint distribution of the two latent representations. By leveraging the two strategies, we argue that the two modalities can be well aligned. Finally, extensive experiments on five widely used datasets verify that the proposed method is able to significantly outperform previous state-of-the-art methods.


“…These results are given for the data sets CUB, SUN, AWA1 and AWA2. We compare our approach with 12 leading GZSL methods, which are divided into three groups: semantic (SJE [24], ALE [25], LATEM [26], ES-ZSL [27], SYNC [12], DEVISE [2]), latent space learning (SAE [15], f-CLSWGAN [11], cycle-WGAN [3] and CADA-VAE [4]) and domain classification (CMT [6] and DAZSL [5]). The semantic group contains methods that only use the seen class visual and semantic samples to learn a transformation function from the visual to the semantic space, and classification is based on nearest neighbour classification in that semantic space.…”

Section: 4 Results (mentioning)

confidence: 99%

“…Our second observation is that samples from unseen classes that are visually different from any of the seen classes tend to be projected outside the distribution of seen classes [6]. Atzmon and Chechik [5] propose a general framework that combines domain expert classifiers, such as DAP [7] for unseen classes, and LAGO for the seen classes [5]. However, this method relies on the disjoint training of both expert models, and the assumption that unseen samples are projected outside the distribution of seen classes [6].…”

Section: Introduction (mentioning)

confidence: 99%
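
The idea of combining a seen-class expert and an unseen-class expert, as described in the statement above, can be illustrated with a generic soft-gating sketch. This is not the cited paper's exact formulation: it simply applies the law of total probability, and the function name `combine_experts`, the gate value, and the toy class names are all assumptions made for illustration.

```python
def combine_experts(p_seen_gate, probs_seen_expert, probs_unseen_expert):
    """Mix two experts' class probabilities with a gate probability.

    p_seen_gate: hypothetical estimate of P(seen | x), in [0, 1].
    probs_seen_expert: dict mapping seen class -> P(class | x, seen).
    probs_unseen_expert: dict mapping unseen class -> P(class | x, unseen).
    """
    if not 0.0 <= p_seen_gate <= 1.0:
        raise ValueError("gate probability must lie in [0, 1]")
    # P(y | x) = P(y | x, seen) P(seen | x) + P(y | x, unseen) P(unseen | x)
    combined = {c: p_seen_gate * p for c, p in probs_seen_expert.items()}
    for c, p in probs_unseen_expert.items():
        combined[c] = combined.get(c, 0.0) + (1.0 - p_seen_gate) * p
    return combined


# Toy example (assumed numbers): the gate thinks the sample is probably unseen.
scores = combine_experts(0.25, {"horse": 0.5, "tiger": 0.5}, {"zebra": 1.0})
prediction = max(scores, key=scores.get)
```

Because the gate weights sum to one, the combined scores remain a probability distribution whenever each expert's output is one.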

Generalised Zero-Shot Learning with Domain Classification in a Joint Semantic and Visual Space

Felix, Harwood, Sasdelli, et al. 2019

2019 Digital Image Computing: Techniques and Applications (DICTA)


Generalised zero-shot learning (GZSL) is a classification problem where the learning stage relies on a set of seen visual classes and the inference stage aims to identify both the seen visual classes and a new set of unseen visual classes. Critically, both the learning and inference stages can leverage a semantic representation that is available for the seen and unseen classes. Most state-of-the-art GZSL approaches rely on a mapping between latent visual and semantic spaces without considering if a particular sample belongs to the set of seen or unseen classes. In this paper, we propose a novel GZSL method that learns a joint latent representation that combines both visual and semantic information. This mitigates the need for learning a mapping between the two spaces. Our method also introduces a domain classification that estimates whether a sample belongs to a seen or an unseen class. Our classifier then combines a class discriminator with this domain classifier with the goal of reducing the natural bias that GZSL approaches have toward the seen classes. Experiments show that our method achieves state-of-the-art results in terms of harmonic mean, the area under the seen and unseen curve and unseen classification accuracy on public GZSL benchmark data sets. Our code will be available upon acceptance of this paper.
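
The harmonic mean mentioned in the abstract above is commonly computed in GZSL evaluation from the accuracy on seen classes and the accuracy on unseen classes. A minimal sketch, assuming both accuracies are given as fractions in [0, 1] (the function name `harmonic_mean` is chosen here for illustration):

```python
def harmonic_mean(acc_seen, acc_unseen):
    """Harmonic mean H = 2 * S * U / (S + U) of seen and unseen accuracy."""
    if acc_seen < 0 or acc_unseen < 0:
        raise ValueError("accuracies must be non-negative")
    if acc_seen + acc_unseen == 0:
        return 0.0  # avoid division by zero when both accuracies are zero
    return 2.0 * acc_seen * acc_unseen / (acc_seen + acc_unseen)


# A model biased toward seen classes scores well below its arithmetic mean.
biased = harmonic_mean(0.8, 0.2)
balanced = harmonic_mean(0.5, 0.5)
```

The metric rewards balance: the biased model above has the same arithmetic mean as the balanced one but a lower harmonic mean, which is why it is used to expose bias toward seen classes.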


“…However, direct search within all classes cannot make good use of the knowledge learned from the seen training samples, thus some Out-of-Domain (OoD) based methods are proposed to first classify the feature as seen or unseen, and then divide the GZSL problem into two sub-tasks: a conventional ZSL task and a fully supervised learning task. For example, some OoD-based methods [7,8] define two classifiers to handle the seen and unseen domains separately. However, they all neglect that OoD detection is also a binary zero-shot classification, so it is unsuitable to use two totally different models for OoD detection and zero-shot classification respectively.…”

Section: Introduction (mentioning)

confidence: 99%

Dual Prototype Relaxation for Generalized Zero Shot Learning

Zhang, Hu 2021

2021 IEEE International Conference on Multimedia and Expo (ICME)


Generalized Zero Shot Learning (GZSL) is proposed to solve the training data missing problem by transferring the knowledge learned in seen classes to unseen classes. Many methods project the visual features into semantic space and find their nearest neighbours among the pre-defined attributes, which has achieved significant success. However, there are two problems with this type of method: first, the projection is a many-to-one mapping, which cannot maintain the diversity of features in semantic space; second, searching within all classes cannot make good use of the knowledge learned in the seen classes. In this paper, we propose a novel method named Dual Prototype Relaxation (DPR) by relaxing the projection from many-to-one to many-to-many. Specifically, we add noise to the semantic prototype in response to the projection of multiple features within a class, and reconstruct the visual features with the same relaxed prototype. Besides, in order to make better use of the knowledge learned in the seen classes, an Out-of-Domain (OoD) based method is employed to first classify each feature into the seen or unseen domain, and then the same DPR model is applied to recognize its category within each domain. Extensive experiments on four popular datasets are conducted and the results show that our method can outperform many linear and deep state-of-the-art methods although our method is a linear one.
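
The two-stage OoD routing described in the abstract above (decide seen versus unseen first, then classify only within that domain) can be sketched with toy components. This is not the DPR model: the nearest-prototype classifier, the distance-threshold detector, and every name and number below are stand-ins assumed purely for illustration.

```python
def nearest_prototype(feature, prototypes):
    """Return (label, squared Euclidean distance) of the closest prototype."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    label = min(prototypes, key=lambda k: sq_dist(feature, prototypes[k]))
    return label, sq_dist(feature, prototypes[label])


def classify_gzsl(feature, seen_protos, unseen_protos, threshold):
    """Route a feature to a domain, then classify within that domain only."""
    # Stage 1 (toy OoD detector, an assumption): a feature far from every
    # seen prototype is treated as belonging to the unseen domain.
    _, dist_to_seen = nearest_prototype(feature, seen_protos)
    domain = "seen" if dist_to_seen <= threshold else "unseen"
    # Stage 2: search only among the classes of the chosen domain.
    protos = seen_protos if domain == "seen" else unseen_protos
    label, _ = nearest_prototype(feature, protos)
    return domain, label


# Hypothetical 2-D prototypes for illustration.
seen = {"horse": (0.0, 0.0), "tiger": (1.0, 0.0)}
unseen = {"zebra": (5.0, 5.0), "whale": (9.0, 9.0)}
near_result = classify_gzsl((0.1, 0.0), seen, unseen, threshold=1.0)
far_result = classify_gzsl((4.0, 5.0), seen, unseen, threshold=1.0)
```

Restricting the second stage to one domain is what keeps a seen-class bias from leaking into unseen-class predictions, at the cost of making the final label depend on the detector's decision.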
