2022
DOI: 10.48550/arxiv.2203.03878
Preprint

HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks

Abstract: The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient fine-tuning has become fairly important for quick transfer learning and deployment. In this paper, we design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks. In particular, we use a shared hypernetwork…
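
The abstract describes a shared hypernetwork that generates the trainable parameters injected into a frozen backbone. Below is a minimal, hypothetical PyTorch sketch of that general idea, mapping a learned task embedding to a block of prompt tokens; the class name, dimensions, and MLP structure are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a small hypernetwork maps a learned task embedding to
# prompt tokens to be prepended to a frozen pretrained model's input embeddings.
# All names and sizes below are illustrative, not taken from the paper.
class PromptHyperNetwork(nn.Module):
    def __init__(self, num_tasks: int, task_dim: int = 64,
                 prompt_len: int = 20, hidden_dim: int = 768):
        super().__init__()
        self.task_embeddings = nn.Embedding(num_tasks, task_dim)
        self.generator = nn.Sequential(
            nn.Linear(task_dim, task_dim),
            nn.ReLU(),
            nn.Linear(task_dim, prompt_len * hidden_dim),
        )
        self.prompt_len = prompt_len
        self.hidden_dim = hidden_dim

    def forward(self, task_id: torch.Tensor) -> torch.Tensor:
        # task_id: (batch,) -> generated prompts: (batch, prompt_len, hidden_dim)
        z = self.task_embeddings(task_id)
        return self.generator(z).view(-1, self.prompt_len, self.hidden_dim)

# Usage: generate task-conditioned prompts, then prepend them to the frozen
# backbone's token embeddings; only the hypernetwork parameters are trained.
hyper = PromptHyperNetwork(num_tasks=8)
prompts = hyper(torch.tensor([0]))  # (1, 20, 768)
```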

Cited by 1 publication (1 citation statement)
References 14 publications

“…For example, SPoT [15] suggests initializing the downstream task prompts with prompts that have been tuned on a mixture of related tasks. Meanwhile, HyperPELT [16] trains a hypernetwork that generates trainable parameters for the main model, including prompt tokens. Another approach, ATTEMPT [17], learns prompts for all the source tasks and then creates an instance-wise prompt for the target task by combining the source tasks' prompts and a newly initialized prompt using an attention block.…”
Section: Related Work
confidence: 99%
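
The cited statement summarizes ATTEMPT's instance-wise prompt composition: attend over frozen source-task prompts plus a newly initialized target prompt, conditioned on the current input. The following is a minimal sketch of that general mechanism, assuming a pooled input representation serves as the attention query; all names and shapes are illustrative assumptions, not the cited authors' code.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of attention-based prompt mixing as described for ATTEMPT [17]:
# combine frozen source-task prompts and a trainable target prompt into one
# instance-wise prompt, weighted by attention scores computed from the input.
class InstancePromptMixer(nn.Module):
    def __init__(self, source_prompts: torch.Tensor, hidden_dim: int = 768):
        super().__init__()
        # source_prompts: (num_source, prompt_len, hidden_dim), kept frozen
        self.register_buffer("source_prompts", source_prompts)
        prompt_len = source_prompts.size(1)
        self.target_prompt = nn.Parameter(torch.zeros(1, prompt_len, hidden_dim))
        self.query_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, input_repr: torch.Tensor) -> torch.Tensor:
        # input_repr: (batch, hidden_dim), e.g. mean-pooled input embeddings
        candidates = torch.cat([self.source_prompts, self.target_prompt], dim=0)
        keys = candidates.mean(dim=1)                  # (num_prompts, hidden_dim)
        scores = self.query_proj(input_repr) @ keys.T  # (batch, num_prompts)
        weights = scores.softmax(dim=-1)               # attention over prompts
        # Weighted sum yields one prompt per instance: (batch, prompt_len, hidden_dim)
        return torch.einsum("bn,nld->bld", weights, candidates)

# Usage: mix 3 source prompts (plus the target prompt) for a batch of 4 inputs.
mixer = InstancePromptMixer(torch.randn(3, 20, 768))
instance_prompt = mixer(torch.randn(4, 768))  # (4, 20, 768)
```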