Entropy-Based Uncertainty Calibration for Generalized Zero-Shot Learning

Chen, Zhi; Huang, Zi; Li, Jingjing; Zhang, Zheng

doi:10.48550/arxiv.2101.03292

Cited by 3 publications

(4 citation statements)

References 67 publications


“…Semantic Rectifying GAN (SRGAN) [22] utilizes manually designed distance functions to rectify over-smoothing semantic features by visual similarities. Some embedding methods [10] and VAE-based methods [27], [28] try to utilize the triplet loss to search automatically more discriminative representations from visual features.…”

Section: A Zero-shot Learning (mentioning)

confidence: 99%

“…For example, one embedding ZSL method, Latent Discriminative Features Learning (LDF) [10], utilizes TL to mine new latent semantic features from visual features. In generative methods, Entropy-based Uncertainty calibration VAE (EUC-VAE) [27] and Over-Complete Distribution VAE (OCD-VAE) [28] integrate TL in VAE to enhance the separability of encoded representations. EUC-VAE designs two TLs trained by visual features and semantic features, respectively.…”

Section: B Triplet Loss In Zsl (mentioning)

confidence: 99%

“…We compare our model with the recent state-of-the-arts published in the last few years. Embedding methods include DEVISE [18] (NeurIPS13), DAP [4] (TPAMI14), SSE [43] (ICCV15), SJE [34] (CVPR15), ESZSL [51] (ICML15), ALE [6] (TPAMI16), LATEM [52] (CVPR16), SYNC [53] (CVPR16), SAE [32] (CVPR17), CRNet [55] (ICML19) and DVBE [56] (CVPR20); generative methods include GAZSL (CVPR18) [9], PSR (CVPR18) [54], f-CLSWGAN [21] (CVPR18), CDL [12] (ECCV18), SRGAN [22] (ICME19), GDAN [24] (CVPR19), DASCN [57] (NeurIPS19), AFC-GAN [58] (ACM MM19), OCD-VAE [28] (CVPR20), EUC-VAE [27] (arxiv21) and LsrGAN [25] (ECCV20). The results of ZSL and GZSL are reported in Table II.…”

Section: B Comparison To State-of-the-arts (mentioning)

confidence: 99%

“…To answer the first question, we notice that recent studies have designed a series of methods for automatically discriminative representation search [22], [11], [25], [12], [10] where the triplet loss is often used [26], [27], [28], [24], [10]. For example, Latent Discriminative Features Learning (LDF) [10] recognizes unseen samples in semantic and latent semantic space, which is searched by triplet loss (TL).…”

Section: Introduction (mentioning)

confidence: 99%


Disentangling Semantic-to-Visual Confusion for Zero-Shot Learning

Lyu et al., 2022. IEEE Trans. Multimedia

Using generative models to synthesize visual features from semantic distribution is one of the most popular solutions to ZSL image classification in recent years. The triplet loss (TL) is popularly used to generate realistic visual distributions from semantics by automatically searching discriminative representations. However, the traditional TL cannot search reliable unseen disentangled representations due to the unavailability of unseen classes in ZSL. To alleviate this drawback, we propose in this work a multi-modal triplet loss (MMTL) which utilizes multimodal information to search a disentangled representation space. As such, all classes can interplay which can benefit learning disentangled class representations in the searched space. Furthermore, we develop a novel model called Disentangling Class Representation Generative Adversarial Network (DCR-GAN) focusing on exploiting the disentangled representations in training, feature synthesis, and final recognition stages. Benefiting from the disentangled representations, DCR-GAN could fit a more realistic distribution over both seen and unseen features. Extensive experiments show that our proposed model can lead to superior performance to the state-of-the-arts on four benchmark datasets. Our code is available at https://github.com/FouriYe/DCRGAN-TMM.
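For orientation, the standard triplet loss that the abstract refers to can be sketched as follows. This is a minimal illustration of the conventional margin-based formulation, not the multi-modal triplet loss (MMTL) proposed in the paper; the function names, the Euclidean distance, and the margin value are assumptions made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Conventional margin-based triplet loss (illustrative only).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Well-separated triplet: the positive is close, the negative is far.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Violating triplet: the "positive" is farther away than the negative.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In the first call the margin is already satisfied, so the loss is zero; in the second, the ordering is violated and the loss is positive, which is the signal that pushes same-class features together and different-class features apart.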


Section: A Zero-shot Learning (mentioning)

confidence: 99%

Section: B Triplet Loss In Zsl (mentioning)

confidence: 99%

Section: B Comparison To State-of-the-arts (mentioning)

confidence: 99%

Section: Introduction (mentioning)

confidence: 99%


A Review of Generalized Zero-Shot Learning Methods

Pourpanah, Abdar, Luo et al., 2020. Preprint

Generalized zero-shot learning (GZSL) aims to train a model for classifying data samples under the condition that some output classes are unknown during supervised learning. To address this challenging task, GZSL leverages semantic information of both seen (source) and unseen (target) classes to bridge the gap between both seen and unseen classes. Since its introduction, many GZSL models have been formulated. In this review paper, we present a comprehensive review of GZSL. Firstly, we provide an overview of GZSL including the problems and challenging issues. Then, we introduce a hierarchical categorization of the GZSL methods and discuss the representative methods of each category. In addition, we discuss several research directions for future studies.


Mitigating Generation Shifts for Generalized Zero-Shot Learning

Chen, Luo, Wang et al., 2021. Preprint

Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training. It is natural to derive generative models and hallucinate training samples for unseen classes based on the knowledge learned from the seen samples. However, most of these models suffer from the generation shifts, where the synthesized samples may drift from the real distribution of unseen data. In this paper, we propose a novel Generation Shifts Mitigating Flow framework, which is comprised of multiple conditional affine coupling layers for learning unseen data synthesis efficiently and effectively. In particular, we identify three potential problems that trigger the generation shifts, i.e., semantic inconsistency, variance decay, and structural permutation and address them respectively. First, to reinforce the correlations between the generated samples and the respective attributes, we explicitly embed the semantic information into the transformations in each of the coupling layers. Second, to recover the intrinsic variance of the synthesized unseen features, we introduce a visual perturbation strategy to diversify the intra-class variance of generated data and hereby help adjust the decision boundary of the classifier. Third, to avoid structural permutation in the semantic space, we propose a relative positioning strategy to manipulate the attribute embeddings, guiding which to fully preserve the inter-class geometric structure. Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings. Our code is available at: https://github.com/uqzhichen/GSMFlow

CCS CONCEPTS • Computing methodologies → Computer vision.
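As a rough illustration of what a conditional affine coupling layer does, the sketch below splits a feature vector in two, leaves the first half unchanged, and rescales and shifts the second half using values computed from the first half and a conditioning vector (for example, class attributes). It is not taken from the GSMFlow code: `toy_scale_shift` is a hypothetical fixed function standing in for the learned network, and the scalar scale and shift are simplifications chosen to keep the example short.

```python
import math


def toy_scale_shift(x1, cond):
    """Hypothetical stand-in for a learned conditioner network.

    Returns a log-scale s and a shift t that depend only on the
    untouched half x1 and the conditioning vector cond.
    """
    h = sum(x1) + sum(cond)
    s = math.tanh(0.1 * h)  # bounded log-scale keeps exp(s) well behaved
    t = 0.5 * h
    return s, t


def coupling_forward(x, cond):
    """Map x to y; returns y and the log-determinant of the Jacobian."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = toy_scale_shift(x1, cond)
    y2 = [v * math.exp(s) + t for v in x2]
    log_det = s * len(x2)  # each transformed dimension contributes s
    return x1 + y2, log_det


def coupling_inverse(y, cond):
    """Exact inverse of coupling_forward for the same cond."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = toy_scale_shift(y1, cond)  # y1 equals x1, so s and t match
    x2 = [(v - t) * math.exp(-s) for v in y2]
    return y1 + x2


x = [1.0, 2.0, 3.0, 4.0]
cond = [0.5, -0.2]  # hypothetical attribute embedding
y, log_det = coupling_forward(x, cond)
x_rec = coupling_inverse(y, cond)
```

Because the first half passes through unchanged, the inverse can recompute the same scale and shift and recover the input exactly, and the log-determinant is available in closed form. These two properties are what make stacks of such layers usable as a normalizing flow.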
