2021
DOI: 10.48550/arxiv.2104.08691
Preprint

The Power of Scale for Parameter-Efficient Prompt Tuning

Abstract: In this work, we explore "prompt tuning", a simple yet effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks. Unlike the discrete text prompts used by GPT-3, soft prompts are learned through backpropagation and can be tuned to incorporate signal from any number of labeled examples. Our end-to-end learned approach outperforms GPT-3's "few-shot" learning by a large margin. More remarkably, through ablations on model size using T5, we show that pr…
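The abstract above describes the core mechanism: a small set of continuous "soft prompt" vectors is prepended to the input embeddings of a frozen language model, and only those vectors, not the model weights, are updated by backpropagation. Below is a minimal, hypothetical PyTorch sketch of that idea; the class and parameter names (SoftPromptClassifier, prompt_length) and the small linear head are illustrative simplifications, not the paper's implementation, which instead reads labels out of the frozen model's own output vocabulary.

```python
# Minimal sketch of "soft prompt" tuning with a frozen backbone (illustrative only).
import torch
import torch.nn as nn

class SoftPromptClassifier(nn.Module):
    def __init__(self, frozen_embed: nn.Embedding, frozen_encoder: nn.Module,
                 prompt_length: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.embed = frozen_embed          # frozen token embedding table
        self.encoder = frozen_encoder      # frozen language-model body
        for p in list(self.embed.parameters()) + list(self.encoder.parameters()):
            p.requires_grad = False        # the backbone stays frozen
        # The only task-specific "prompt" parameters: prompt_length virtual tokens.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_length, hidden_dim) * 0.02)
        self.head = nn.Linear(hidden_dim, num_classes)  # small head, a simplification

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        tok = self.embed(input_ids)                                # (B, T, H)
        prompt = self.soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        x = torch.cat([prompt, tok], dim=1)                        # prepend virtual tokens
        h = self.encoder(x)                                        # frozen forward pass
        return self.head(h.mean(dim=1))                            # pooled logits

# Toy usage: a tiny randomly initialized encoder stands in for a pretrained LM.
vocab, hidden = 1000, 64
embed = nn.Embedding(vocab, hidden)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True), num_layers=2)
model = SoftPromptClassifier(embed, encoder, prompt_length=20,
                             hidden_dim=hidden, num_classes=2)
optim = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=1e-3)
logits = model(torch.randint(0, vocab, (8, 16)))
loss = nn.functional.cross_entropy(logits, torch.randint(0, 2, (8,)))
loss.backward()
optim.step()   # only the soft prompt (and the toy head) are updated
```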

Cited by 198 publications (401 citation statements); references 27 publications.

Selected citation statements (ordered by relevance):
“…While manually crafting prompts [Brown et al, 2020; Radford et al, 2021] is intuitive, creating and experimenting with these prompts takes time and experience, and even experienced prompt designers may fail to discover optimal prompts manually. To automate prompt engineering, [Lester et al, 2021; Zhou et al, 2021] parameterized the prompts by treating them as virtual tokens and performing prompting directly in the embedding space.…”
Section: Prompt Tuning Methods in NLP (mentioning)
confidence: 99%
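The contrast drawn in this excerpt, discrete text prompts versus virtual tokens tuned in embedding space, can be made concrete with a small hypothetical helper: a soft prompt can be seeded from the embeddings of ordinary words and then trained as free continuous parameters. This is a sketch of the general idea only; the function name and dimensions are assumptions, not code from Lester et al.

```python
# Seed a trainable soft prompt from the embeddings of discrete tokens (illustrative).
import torch
import torch.nn as nn

def init_soft_prompt_from_tokens(embed: nn.Embedding, token_ids: list[int]) -> nn.Parameter:
    """Copy the (frozen) embeddings of discrete tokens as the starting point
    for a trainable soft prompt of the same length."""
    with torch.no_grad():
        init = embed(torch.tensor(token_ids)).clone()    # (prompt_length, hidden)
    return nn.Parameter(init)                            # now continuous and trainable

# Example: seed a 4-token prompt from arbitrary vocabulary ids.
embed = nn.Embedding(1000, 64)
soft_prompt = init_soft_prompt_from_tokens(embed, [12, 57, 301, 9])
print(soft_prompt.shape, soft_prompt.requires_grad)      # torch.Size([4, 64]) True
```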
“…Following [Lester et al, 2021], we initialize the class-specific prompts p_c to maximize the likelihood P(y_pred = y | p_c). However, as in part (e) of Figure 1, there are significant differences in the content of different affective images even when they belong to the same class.…”
Section: Diversified Prompts Composition (mentioning)
confidence: 99%
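The excerpt above initializes one prompt per class so that p_c maximizes P(y_pred = y | p_c). One way to realize such an initialization is to optimize each class prompt directly against that objective: maximizing the likelihood is equivalent to minimizing the cross-entropy of class c under its own prompt. The sketch below illustrates only this objective; the placeholder linear scorer stands in for the frozen model and none of the names come from the cited paper.

```python
# Optimize per-class prompts p_c to maximize P(y = c | p_c) (hypothetical sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, prompt_len, hidden = 3, 8, 64
# One trainable soft prompt per class: shape (C, prompt_len, hidden).
class_prompts = nn.Parameter(torch.randn(num_classes, prompt_len, hidden) * 0.02)

scorer = nn.Linear(hidden, num_classes)      # placeholder for the frozen model
for p in scorer.parameters():
    p.requires_grad_(False)

def class_logits(prompt: torch.Tensor) -> torch.Tensor:
    """Score a single class prompt against all classes."""
    return scorer(prompt.mean(dim=0))        # (C,)

optim = torch.optim.Adam([class_prompts], lr=1e-2)
for step in range(100):
    loss = 0.0
    for c in range(num_classes):
        logits = class_logits(class_prompts[c])
        # maximize log P(y = c | p_c)  ==  minimize cross-entropy with target c
        loss = loss + F.cross_entropy(logits.unsqueeze(0), torch.tensor([c]))
    optim.zero_grad()
    loss.backward()
    optim.step()
```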
“…Unlike traditional approaches that encode the sentence into a set of vectors and then classify its sentiment through a fully connected layer, the prompt-based method constructs a set of templates, for example ("I am always happy to see you, the sentence's sentiment is [MASK]"), and then asks the model to predict the [MASK] token according to the PLM's original training task. This approach has gone through several stages, from manual template construction [Jiang et al 2020], to automated search for discrete tokens [Shin et al 2020], to continuous virtual token representations [Lester et al 2021; Li and Liang 2021]. It has achieved great success in few-shot scenarios.…”
Section: Fine-tuning (mentioning)
confidence: 99%
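The excerpt above describes sentiment classification by asking a masked PLM to fill the [MASK] slot of a handcrafted template and reading off label words. A brief sketch of that manual-template stage using the Hugging Face transformers API follows; the checkpoint name and the verbalizer words ("great"/"terrible") are illustrative choices rather than details from the cited work.

```python
# Prompt a masked LM with a handcrafted template and score label words at [MASK].
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

sentence = "I am always happy to see you"
template = f"{sentence}, the sentence's sentiment is {tokenizer.mask_token}."
verbalizer = {"positive": "great", "negative": "terrible"}   # label -> word at [MASK]

inputs = tokenizer(template, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                          # (1, seq_len, vocab)

mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
mask_logits = logits[0, mask_pos.item()]

scores = {label: mask_logits[tokenizer.convert_tokens_to_ids(word)].item()
          for label, word in verbalizer.items()}
print(max(scores, key=scores.get), scores)                   # label with the higher verbalizer score
```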
“…The approach achieves impressive results on some generative tasks such as data-to-text. An extension of the model, namely P-tuning [Lester et al 2021], serves a similar purpose. Unlike prefix-tuning [Li and Liang 2021], p-tuning does not place the prompt as a "prefix" in the input, but constructs a suitable template to prompt the PLM, where the template is composed of continuous virtual tokens obtained through gradient descent.…”
Section: Fine-tuning (mentioning)
confidence: 99%