2021
DOI: 10.48550/arxiv.2103.10385
Preprint

GPT Understands, Too

Cited by 110 publications (161 citation statements)
References 0 publications
“…$\in \mathbb{R}^{l \times d/N_h}$ denote the $i$-th head vector. Prompt-tuning (Lester et al, 2021) simplifies prefix-tuning by only prepending to the input word embeddings in the first layer; similar work also includes P-tuning (Liu et al, 2021b).…”
Section: Overview of Previous Parameter-Efficient Tuning Methods (mentioning)
confidence: 99%
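To make the mechanism described in the excerpt above concrete, the following is a minimal PyTorch sketch of prompt-tuning: a small matrix of trainable "soft prompt" vectors is prepended to the input word embeddings of the first layer, while the pretrained model itself stays frozen. The names SoftPromptWrapper and prompt_len are illustrative assumptions rather than code from the cited papers, and the inputs_embeds keyword assumes a HuggingFace-style forward signature.

```python
# Sketch of prompt-tuning as described in the excerpt above: only the soft
# prompt parameters are trained; the pretrained model is frozen.
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):  # illustrative name, not from the papers
    def __init__(self, base_model: nn.Module, embed: nn.Embedding, prompt_len: int = 20):
        super().__init__()
        self.base_model = base_model          # frozen pretrained LM
        self.embed = embed                    # its word-embedding table
        d_model = embed.embedding_dim
        # Trainable soft prompt of shape (l, d): the only new parameters.
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        for p in self.base_model.parameters():
            p.requires_grad = False

    def forward(self, input_ids: torch.Tensor):
        # input_ids: (batch, seq_len) token ids
        tok_emb = self.embed(input_ids)                              # (B, T, d)
        prompt = self.soft_prompt.unsqueeze(0).expand(tok_emb.size(0), -1, -1)
        inputs_embeds = torch.cat([prompt, tok_emb], dim=1)          # prepend prompt
        # Assumes the base model accepts precomputed embeddings via
        # `inputs_embeds`, as HuggingFace-style models do.
        return self.base_model(inputs_embeds=inputs_embeds)
```

Prefix-tuning, by contrast, injects trainable vectors into the attention computation at every layer rather than only at the embedding layer, which is the simplification the excerpt points to.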
“…We suspect that effective music audio generation necessitates intermediate representations that would also contain useful information for MIR. This hypothesis is further motivated by an abundance of previous work in NLP suggesting that generative and self-supervised pre-training can yield powerful representations for discriminative tasks [22][23][24][25].…”
Section: Calm Pre-training (mentioning)
confidence: 99%
“…Methods [25,61] have been proposed to automate the prompt engineering process. The prompting process does not tune any of the parameters, which is empirically sub-optimal compared to fine-tuning [43].…”
Section: Related Work (mentioning)
confidence: 99%
“…For the few-shot scenario, [20] proves that prompt tuning can be much better than traditional fine-tuning. When the training data is sufficient, prompt tuning performs slightly worse than fine-tuning [43]. However, the performance gap from full-model fine-tuning closes up as the pre-trained model gets larger [33,42].…”
Section: Related Work (mentioning)
confidence: 99%