Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)
DOI: 10.18653/v1/2021.acl-long.172

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Abstract: Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a downstream task. As such, it adds only a few trainable parameters per new task, allowing a high degree of parameter sharing. Prior studies have shown that adapter-based tuning often achieves comparable results to fine-tuning. However, existing work only focuses on the parameter-effic…
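For readers unfamiliar with the mechanism the abstract describes, the sketch below shows a Houlsby-style bottleneck adapter and the freezing step that leaves only adapter parameters trainable. It is a minimal PyTorch illustration, not the paper's implementation; the hidden and bottleneck sizes, the `freeze_all_but_adapters` helper, and its name-matching heuristic are assumptions made for this example.

```python
import torch.nn as nn

class Adapter(nn.Module):
    """Houlsby-style bottleneck adapter: down-project, nonlinearity,
    up-project, then a residual connection back to the input."""
    def __init__(self, hidden_size=768, bottleneck_size=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_size, hidden_size)

    def forward(self, hidden_states):
        # The residual path keeps the pretrained representation intact,
        # so a near-identity adapter barely perturbs the frozen PrLM.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def freeze_all_but_adapters(model: nn.Module):
    """Hypothetical helper: freeze the pretrained backbone and train only
    parameters whose names contain 'adapter' (real setups often also train
    layer norms and the task-specific head)."""
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name.lower()
```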

Cited by 70 publications (75 citation statements)
References 33 publications
“…The RoBERTa-large MNLI results of our adapter implementation is on par with the recent state-of-the-art Compacter adapters on T5 (Mahabadi et al., 2021), but generalization in both BERT and RoBERTa is overall worse than with vanilla finetuning. Following on the recent report of adapter efficacy in low-resource setting (He et al., 2021), we conducted an additional experiment with adapters and RoBERTa-large, where the model had to learn from a small, more informative subsample. At 1024 training examples adapters performed better when the MNLI subsample was diverse (selected with K-means-based clustering, see appendix D) rather than randomly selected: 80.7% vs 85%.…”
Section: Negative Results (citation type: mentioning; confidence: 99%)
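The "K-means-based clustering" subsample selection mentioned in the quote above can be sketched roughly as follows: embed each training example, cluster the embeddings into as many groups as the example budget, and keep the example nearest each centroid. This is only an illustration of the general idea; the citing paper's exact procedure is in its appendix D, and the `diverse_subsample` helper, the embedding matrix, and the `budget` parameter here are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def diverse_subsample(embeddings: np.ndarray, budget: int = 1024, seed: int = 0) -> np.ndarray:
    """Return indices of a diverse subset: one example per K-means cluster,
    chosen as the member closest to its cluster centroid."""
    km = KMeans(n_clusters=budget, random_state=seed, n_init=10).fit(embeddings)
    chosen = []
    for c in range(budget):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        chosen.append(members[np.argmin(dists)])
    return np.array(chosen)
```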
“…Adapter-tuning has shown to be on par with fine-tuning and sometimes exhibits better effectiveness in the low-resource setting (He et al., 2021). Later studies extend adapter-tuning to multi-lingual (Pfeiffer et al., 2021) and multi-task (Karimi Mahabadi et al., 2021) settings, or further reduce the trainable parameters, which can be easily incorporated into UNIPELT as a replacement of the vanilla adapter-tuning.…”
Section: PELT Methods w/ Additional Parameters (citation type: mentioning; confidence: 99%)
“…We conduct extensive experiments on the General Language Understanding Evaluation (GLUE) benchmark (Wang et al., 2019), which involves four types of natural language understanding tasks including linguistic acceptability (CoLA), sentiment analysis (SST-2), similarity and paraphrase tasks (MRPC, STS-B, QQP), and natural language inference (MNLI, QNLI, RTE). WNLI is omitted following prior studies (Houlsby et al., 2019; Devlin et al., 2019; He et al., 2021; Ben Zaken et al., 2021) due to its adversarial nature. Data Setup.…”
Section: Experiments Setup (citation type: mentioning; confidence: 99%)
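As a concrete companion to the setup described in the quote above, the snippet below loads the eight GLUE tasks with the Hugging Face `datasets` library while omitting WNLI. It is a hedged sketch of a typical data setup, not the citing paper's actual code; the task list and the printed statistics are illustrative.

```python
from datasets import load_dataset

# Eight GLUE tasks commonly used when WNLI is omitted (illustrative setup).
GLUE_TASKS = ["cola", "sst2", "mrpc", "stsb", "qqp", "mnli", "qnli", "rte"]

# Each entry is a DatasetDict with train/validation splits.
glue = {task: load_dataset("glue", task) for task in GLUE_TASKS}

# Quick sanity check: training-set size per task.
print({task: len(splits["train"]) for task, splits in glue.items()})
```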