2022
DOI: 10.48550/arxiv.2210.06466
Preprint

Prompt Generation Networks for Efficient Adaptation of Frozen Vision Transformers

Abstract: Large-scale pretrained models, especially those trained on vision-language data, have demonstrated the tremendous value that can be gained from both larger training datasets and larger models. To benefit from these developments, there is renewed interest in transfer learning and in adapting models from large-scale general pretraining to particular downstream tasks. However, the continuously increasing size of these models means that even the classic approach of finetuning is becoming infeasible for all but…
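The abstract and the citing papers below describe the core idea: instead of hand-tuning a fixed set of prompt tokens, a small trainable network generates input-dependent prompts that are prepended to the token sequence of a frozen backbone. The following is a minimal PyTorch sketch of that idea; the module names, shapes, the mean-pooled summary, and the toy frozen encoder are all illustrative assumptions, not the authors' reference implementation.

```python
# Hypothetical sketch of a Prompt Generation Network (PGN): a small
# trainable module produces input-conditioned prompt tokens, which are
# prepended to the patch tokens of a frozen transformer encoder.
# Shapes and the stand-in encoder below are assumptions for illustration.
import torch
import torch.nn as nn

EMBED_DIM, NUM_PROMPTS, NUM_PATCHES = 192, 4, 16

class PromptGenerationNetwork(nn.Module):
    """Maps a cheap summary of the input to a set of prompt tokens."""
    def __init__(self, embed_dim: int, num_prompts: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, embed_dim),
            nn.GELU(),
            nn.Linear(embed_dim, num_prompts * embed_dim),
        )
        self.num_prompts = num_prompts
        self.embed_dim = embed_dim

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, D) patch embeddings; mean-pool as the summary.
        summary = tokens.mean(dim=1)                  # (B, D)
        prompts = self.mlp(summary)                   # (B, P * D)
        return prompts.view(-1, self.num_prompts, self.embed_dim)

# Stand-in for a large pretrained ViT encoder, kept frozen during adaptation.
frozen_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=4, batch_first=True),
    num_layers=2,
)
for p in frozen_encoder.parameters():
    p.requires_grad = False

pgn = PromptGenerationNetwork(EMBED_DIM, NUM_PROMPTS)  # trainable
head = nn.Linear(EMBED_DIM, 10)                        # trainable task head

patch_tokens = torch.randn(8, NUM_PATCHES, EMBED_DIM)  # dummy batch
prompts = pgn(patch_tokens)                            # (8, P, D)
x = torch.cat([prompts, patch_tokens], dim=1)          # prepend prompts
features = frozen_encoder(x).mean(dim=1)               # pooled output
logits = head(features)                                # task prediction
```

Only the PGN and the task head carry gradients, which is what makes this kind of adaptation cheap relative to full finetuning of the frozen backbone.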

Cited by 2 publications (11 citation statements)
References 21 publications
“…Experimentally, VPTM outperforms other visual prompt learning methods [26,1,6,38,57] with better efficiency. Extensive experiments show the consistency between pretraining and downstream visual classification contributes to the robustness against learning strategies for different datasets, prompt locations, prompt length, and prototype dimensions.…”
Section: Introduction
Mentioning confidence: 93%
“…Mid. Visual prompt methods designed on discriminative pre-trained models concentrate on adding prompts to input space (VPT [26], VP [1], ILM-VP [6], EVP [57]) or learning prompt network (PGN [38]), while ignoring task consistency. Bottom.…”
Section: Introduction
Mentioning confidence: 99%