Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.88
ConvFiT: Conversational Fine-Tuning of Pretrained Language Models

Abstract: Transformer-based language models (LMs) pretrained on large text collections have been shown to store a wealth of semantic knowledge. However, 1) they are not effective as sentence encoders when used off-the-shelf, and 2) they thus typically lag behind conversationally pretrained encoders (e.g., pretrained via response selection) on conversational tasks such as intent detection (ID). In this work, we propose ConvFiT, a simple and efficient two-stage procedure which turns any pretrained LM into a universal conversational encoder (…)
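The two-stage procedure sketched in the abstract (LM adaptation followed by conversational fine-tuning) can be illustrated with a minimal example of the second stage: contrastive fine-tuning that pulls same-intent utterances together so the LM becomes a sentence encoder. The model name, helper functions, loss, and toy data below are illustrative assumptions built on standard Hugging Face and PyTorch APIs, not the authors' released implementation.

```python
# Minimal sketch of ConvFiT-style stage-2 contrastive fine-tuning (assumption:
# an in-batch contrastive loss over same-intent utterance pairs).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"          # any pretrained LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Mean-pool the final hidden states into fixed-size sentence vectors."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)        # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)         # (B, H)

def in_batch_contrastive_loss(anchors, positives, temperature=0.05):
    """anchors[i] and positives[i] share an intent; other rows act as negatives."""
    a = F.normalize(embed(anchors), dim=-1)
    p = F.normalize(embed(positives), dim=-1)
    logits = a @ p.T / temperature                      # scaled cosine similarities
    labels = torch.arange(len(anchors))                 # the diagonal entries are positives
    return F.cross_entropy(logits, labels)

# One illustrative optimisation step on a toy batch of same-intent pairs.
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)
anchors   = ["book a table for two", "what's the weather tomorrow"]
positives = ["reserve a table for 2 people", "will it rain tomorrow"]
loss = in_batch_contrastive_loss(anchors, positives)
loss.backward()
optimizer.step()
```

After such fine-tuning, the mean-pooled vectors can be used directly as conversational sentence embeddings, e.g. for nearest-neighbour intent detection.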

Cited by 21 publications (22 citation statements)
References 52 publications
“…Other work on few-shot intent classification explores fine-tuning dialogue-specific LMs as classifiers as well as using similarity-based classifiers instead of MLP-based ones on top of BERT (Vulić et al., 2021). We believe that improvements brought by data augmentation would be complementary to the gains brought by these methods.…”
Section: GPT-3 Predictions (mentioning)
confidence: 99%
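The similarity-based classifier mentioned in this excerpt, used in place of an MLP head on top of BERT, can be sketched as nearest-centroid classification over sentence embeddings. The helper names and toy intents below are hypothetical, and `embed` refers to the mean-pooling helper sketched after the abstract above.

```python
# Sketch of a similarity-based few-shot intent classifier: compare an utterance
# embedding to per-intent centroids by cosine similarity, rather than training
# an MLP softmax head.
import torch
import torch.nn.functional as F

def build_centroids(embed, support_utterances):
    """support_utterances: dict mapping intent label -> list of example texts."""
    return {
        intent: F.normalize(embed(texts).mean(0), dim=-1)
        for intent, texts in support_utterances.items()
    }

def classify(embed, centroids, utterance):
    """Return the intent whose centroid is most cosine-similar to the utterance."""
    query = F.normalize(embed([utterance])[0], dim=-1)
    scores = {intent: torch.dot(query, c).item() for intent, c in centroids.items()}
    return max(scores, key=scores.get)

# Few-shot example: a handful of labelled utterances per intent.
support = {
    "book_restaurant": ["reserve a table for tonight", "book dinner for four"],
    "check_weather":   ["is it going to rain", "weather forecast for Paris"],
}
centroids = build_centroids(embed, support)   # `embed` from the earlier sketch
print(classify(embed, centroids, "get me a table at 7pm"))
```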
“…For example, one line of work formulated intent recognition as a sentence similarity task and pre-trained on natural language inference (NLI) datasets. Vulić et al. (2021) and Zhang et al. (2021e) pre-trained with a contrastive loss on intent detection tasks. Our multi-task pre-training method is inspired by Zhang et al. (2021d), which leverages publicly available intent datasets and unlabeled data in the current domain for pre-training to improve the performance of few-shot intent detection.…”
Section: Related Work (mentioning)
confidence: 99%
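The contrastive pre-training referred to here (Vulić et al., 2021; Zhang et al., 2021e) is typically an in-batch cross-entropy over scaled cosine similarities of same-intent pairs. The formulation below is a generic sketch of that family of objectives, not a transcription of the exact loss in either cited paper.

```latex
% In-batch contrastive loss over N utterance pairs (x_i, x_i^+) sharing an intent,
% with sentence encoder f, cosine similarity sim, and temperature tau.
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N}
  \log \frac{\exp\big(\mathrm{sim}(f(x_i), f(x_i^{+})) / \tau\big)}
            {\sum_{j=1}^{N} \exp\big(\mathrm{sim}(f(x_i), f(x_j^{+})) / \tau\big)}
```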
“…Recent advances in pre-trained language models have resulted in impressive performance on open-domain text generation, such as story completion (See et al., 2019; Yao et al., 2019; Fan et al., 2019; Ippolito et al., 2020), dialogue generation (Rashkin et al., 2019b; Zhang et al., 2020b; Li, 2020; Vulić et al., 2021), question generation (Cheng et al., 2021; Wang et al., 2021), and so on. For example, in dialogue generation, Zhang et al. (2020b) […]. Despite the success of generative pre-trained language models on a series of open-ended text generation tasks, they still struggle to maintain coherence throughout multiple sentences due to the left-to-right, word-by-word generation style (Fan et al., 2019; …).…”
Section: Related Work (mentioning)
confidence: 99%