2022
DOI: 10.48550/arxiv.2203.12119
Preprint

Visual Prompt Tuning

Cited by 25 publications (40 citation statements) | References 0 publications

Citation statements (ordered by relevance):

“…Little attention has been drawn to efficient adaptation, especially for vision Transformers. Inspired by prompting in NLP, [45] introduced learnable tokens to explore efficient adaptation of ViTs. We empirically found that the performance of prompting is hindered by the scale of tokens.…”
Section: Efficient Transfer Learning for Transformers (mentioning)
confidence: 99%
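The statement above refers to prepending learnable tokens to a frozen vision Transformer. The following is a minimal PyTorch sketch of that idea; the class name PromptedViT, the mean pooling, and the stand-in encoder are illustrative assumptions, not the cited papers' exact implementations.

```python
# Minimal sketch of prompt tuning for a ViT-style encoder (hypothetical names).
# Only the prompt tokens and the classification head receive gradients; the
# pretrained backbone stays frozen.
import torch
import torch.nn as nn

class PromptedViT(nn.Module):
    def __init__(self, backbone: nn.Module, embed_dim: int = 768,
                 num_prompts: int = 10, num_classes: int = 100):
        super().__init__()
        self.backbone = backbone                       # pretrained, frozen encoder
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Learnable prompt tokens prepended to the patch-token sequence.
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, num_classes)  # task-specific head

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, seq_len, embed_dim), already-embedded patches.
        b = patch_tokens.size(0)
        prompts = self.prompts.expand(b, -1, -1)
        x = torch.cat([prompts, patch_tokens], dim=1)  # prepend prompt tokens
        x = self.backbone(x)
        return self.head(x.mean(dim=1))                # pooled representation -> logits

# Usage with a stand-in encoder (a real setup would load pretrained ViT weights).
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True), num_layers=2)
model = PromptedViT(encoder)
logits = model(torch.randn(4, 196, 768))               # 4 images, 14x14 patch tokens
print(logits.shape)                                    # torch.Size([4, 100])
```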
“…Compared to our method, we note that recent prompt-related approaches insert trainable parameters into the token space, as illustrated in Figure 3. They prepend learnable parameters either to the embedded tokens before the linear projection [52] or to the key and value tokens after the linear projection [45]. Prompt-related methods therefore cannot be straightforwardly adapted to specialized MHSA variants, especially those that take pyramid spatial information into account [56,73].…”
Section: Multi-head Attention (mentioning)
confidence: 99%
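To make the two insertion points concrete, here is a minimal single-head PyTorch sketch; the PromptedAttention class and its mode flag are hypothetical and not the formulation of any specific cited paper. Mode "input" prepends prompts to the embedded tokens before the QKV projection, while mode "kv" appends learned prompts directly to the keys and values after projection, leaving the queries and the output sequence length unchanged.

```python
# Sketch contrasting the two prompt insertion points (hypothetical class name).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptedAttention(nn.Module):
    """Single-head self-attention with two optional prompt insertion points."""
    def __init__(self, dim: int, num_prompts: int = 5, mode: str = "kv"):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.mode = mode  # "input": prepend before projection; "kv": append after
        if mode == "input":
            # Prompts join the token sequence and pass through the QKV projection.
            self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        else:
            # Prompts are injected as extra keys and values (queries untouched).
            self.k_prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
            self.v_prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.size(0)
        if self.mode == "input":
            x = torch.cat([self.prompts.expand(b, -1, -1), x], dim=1)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        if self.mode == "kv":
            k = torch.cat([self.k_prompts.expand(b, -1, -1), k], dim=1)
            v = torch.cat([self.v_prompts.expand(b, -1, -1), v], dim=1)
        attn = F.softmax(q @ k.transpose(-2, -1) / q.size(-1) ** 0.5, dim=-1)
        return self.proj(attn @ v)

tokens = torch.randn(2, 196, 256)
print(PromptedAttention(256, mode="input")(tokens).shape)  # (2, 201, 256): sequence grows
print(PromptedAttention(256, mode="kv")(tokens).shape)     # (2, 196, 256): length preserved
```

Because both variants treat the input as a flat token sequence with no spatial coordinates attached to the prompts, attention modules that reshape tokens into spatial pyramids or windows cannot absorb them without extra handling, which is the difficulty the statement above points out.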
“…Prompt tuning and other PEFT methods have also been explored outside the context of language models (e.g., vision [22,69] and vision-and-language models [26]).…”
Section: Related Work (mentioning)
confidence: 99%