Proceedings of the 4th Workshop on NLP for Conversational AI 2022
DOI: 10.18653/v1/2022.nlp4convai-1.10

Knowledge Distillation Meets Few-Shot Learning: An Approach for Few-Shot Intent Classification Within and Across Domains

Abstract: Large Transformer-based natural language understanding models have achieved state-of-the-art performance in dialogue systems. However, scarce labeled data for training, the large model size, and low inference speed hinder their deployment in low-resource scenarios. Few-shot learning and knowledge distillation techniques have been introduced to reduce the need for labeled data and computational resources, respectively. However, these techniques are incompatible because few-shot learning trains models using few d…

Cited by 9 publications (7 citation statements)
References 17 publications
“…Recent work focuses on developing advanced techniques to guide the distillation [37,38] and its applications to practical problems, such as object detection [39] and data-free settings [40]. Knowledge distillation has been employed in FSL methods [9,13,18,34,35,41,42]. These methods typically adopt the model compression strategy, that is, under the guidance of a teacher model to build the student model.…”
Section: Knowledge Distillation
Mentioning confidence: 99%
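To make the "model compression strategy" in the statement above concrete, here is a minimal sketch of the standard teacher-student distillation objective: the student is trained on a weighted mix of hard-label cross-entropy and KL divergence against the teacher's temperature-softened outputs. The function name, weighting scheme, and hyperparameters are illustrative assumptions, not the cited papers' exact setup.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Combine hard-label CE with soft-label KL to the teacher (illustrative)."""
    # Soft targets: teacher probabilities at temperature T.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The KL term is scaled by T^2 to keep gradient magnitudes comparable
    # to the cross-entropy term, as is common practice in distillation.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```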
“…SKD [41] mines auxiliary self-supervised learning (SSL) signals from the limited data to learn the output-space manifold of each class. Sauer et al [18] train a prototypical teacher network on source classes and domains to pass transferable knowledge to a designed prototypical student network via knowledge distillation.…”
Section: Knowledge Distillation
Mentioning confidence: 99%
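The statement above describes distilling a prototypical teacher into a prototypical student. A rough sketch of that idea, under generic assumptions, is shown below: both networks embed a few-shot episode, build class prototypes from the support set, and the student is trained to match the teacher's prototype-based query distribution. The encoders, shapes, and temperature are hypothetical placeholders, not the architecture of Sauer et al. [18].

```python
import torch
import torch.nn.functional as F

def prototype_logits(encoder, support, support_labels, query, n_classes):
    """Negative squared Euclidean distance from query embeddings to class prototypes."""
    z_support = encoder(support)              # [n_support, dim]
    z_query = encoder(query)                  # [n_query, dim]
    prototypes = torch.stack([
        z_support[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])                                        # [n_classes, dim]
    return -torch.cdist(z_query, prototypes) ** 2   # [n_query, n_classes]

def proto_distill_loss(student, teacher, episode, n_classes, temperature=2.0):
    """Distill the teacher's prototype-based query distribution into the student."""
    support, support_labels, query, query_labels = episode
    with torch.no_grad():
        t_logits = prototype_logits(teacher, support, support_labels, query, n_classes)
    s_logits = prototype_logits(student, support, support_labels, query, n_classes)
    kd = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    ce = F.cross_entropy(s_logits, query_labels)
    return kd + ce
```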