Proceedings of the 4th Workshop on NLP for Conversational AI 2022
DOI: 10.18653/v1/2022.nlp4convai-1.10

Knowledge Distillation Meets Few-Shot Learning: An Approach for Few-Shot Intent Classification Within and Across Domains

Abstract: Large Transformer-based natural language understanding models have achieved state-of-the-art performance in dialogue systems. However, scarce labeled data for training, the large model size, and low inference speed hinder their deployment in low-resource scenarios. Few-shot learning and knowledge distillation techniques have been introduced to reduce the need for labeled data and computational resources, respectively. However, these techniques are incompatible because few-shot learning trains models using few d…

Cited by 9 publications (7 citation statements)
References 17 publications
“…Recent work focuses on developing advanced techniques to guide the distillation [37,38] and its applications to practical problems, such as object detection [39] and data-free settings [40]. Knowledge distillation has been employed in FSL methods [9,13,18,34,35,41,42]. These methods typically adopt the model compression strategy, that is, under the guidance of a teacher model to build the student model.…”
Section: Knowledge Distillation
Mentioning confidence: 99%
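To make the "model compression strategy" in the statement above concrete, here is a minimal sketch of the standard teacher-student distillation objective: the student is trained on a weighted mix of hard-label cross-entropy and KL divergence against the teacher's temperature-softened outputs. The function name, weighting scheme, and hyperparameters are illustrative assumptions, not the cited papers' exact setup.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Combine hard-label CE with soft-label KL to the teacher (illustrative)."""
    # Soft targets: teacher probabilities at temperature T.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The KL term is scaled by T^2 to keep gradient magnitudes comparable
    # to the cross-entropy term, as is common practice in distillation.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```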
“…SKD [41] mines auxiliary self-supervised learning (SSL) signals from the limited data to learn the output-space manifold of each class. Sauer et al [18] train a prototypical teacher network on source classes and domains to pass transferable knowledge to a designed prototypical student network via knowledge distillation.…”
Section: Knowledge Distillation
Mentioning confidence: 99%
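The statement above describes distilling a prototypical teacher into a prototypical student. A rough sketch of that idea, under generic assumptions, is shown below: both networks embed a few-shot episode, build class prototypes from the support set, and the student is trained to match the teacher's prototype-based query distribution. The encoders, shapes, and temperature are hypothetical placeholders, not the architecture of Sauer et al. [18].

```python
import torch
import torch.nn.functional as F

def prototype_logits(encoder, support, support_labels, query, n_classes):
    """Negative squared Euclidean distance from query embeddings to class prototypes."""
    z_support = encoder(support)              # [n_support, dim]
    z_query = encoder(query)                  # [n_query, dim]
    prototypes = torch.stack([
        z_support[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])                                        # [n_classes, dim]
    return -torch.cdist(z_query, prototypes) ** 2   # [n_query, n_classes]

def proto_distill_loss(student, teacher, episode, n_classes, temperature=2.0):
    """Distill the teacher's prototype-based query distribution into the student."""
    support, support_labels, query, query_labels = episode
    with torch.no_grad():
        t_logits = prototype_logits(teacher, support, support_labels, query, n_classes)
    s_logits = prototype_logits(student, support, support_labels, query, n_classes)
    kd = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    ce = F.cross_entropy(s_logits, query_labels)
    return kd + ce
```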