2021
DOI: 10.1007/978-3-030-80599-9_2

Scaling Federated Learning for Fine-Tuning of Large Language Models

Cited by 20 publications (8 citation statements)
References 4 publications

“…Comparison of machine learning models with salient features is given in Table I. Federated Learning Network (FLN) has also been adopted to orchestrate various mobile devices across the world for training language models with BERT [7]. All such mobile devices are owned by different users and are connected over various types of links such as WiFi, mobile networks, etc.…”
Section: Federated Model and Critical Analysis (mentioning)
confidence: 99%

“…Specifically, we examine cross-device FL (Kairouz et al., 2021b), where local clients are edge devices with limited resources and computing power, which can number in the millions. Previous works on language modeling in cross-device FL often use small recurrent-based models of less than 10M parameters (Hard et al., 2018b), while more recent works leverage a variety of efficient techniques for training larger Transformer-based models (Hilmkil et al., 2021; Ro et al., 2022). In this work, we investigate modular strategies applicable to various model architectures for improving training of both small and large models in cross-device FL.…”
Section: Introduction (mentioning)
confidence: 99%

“…[12] confirms that it is possible to both pre-train and fine-tune BERT models in a federated manner using clinical texts from different silos without moving the data. [13] provides an overview of the applicability of federated learning to Transformer-based language models. [14] first applied federated learning to Transformer-based Neural Machine Translation to avoid sharing customers’ chat records with the server.…”
Section: Introduction (mentioning)
confidence: 99%
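
The citation statements above all describe variants of the same idea: federated averaging of language-model weights after local fine-tuning on private client data. The sketch below is a minimal, framework-free illustration of that FedAvg-style loop; the function names, the noise-based stand-in for local gradients, and the toy model shapes are assumptions made for illustration only and are not taken from the paper or the citing works.

```python
import numpy as np

def local_finetune(global_weights, client_data, lr=0.01, steps=5):
    # Hypothetical local update: start from the broadcast global weights and
    # take a few gradient steps on the client's private data. The gradient is
    # faked with small random noise so the sketch stays self-contained.
    w = {name: arr.copy() for name, arr in global_weights.items()}
    for _ in range(steps):
        for name in w:
            fake_grad = np.random.randn(*w[name].shape) * 0.01  # stand-in for a real gradient
            w[name] -= lr * fake_grad
    return w, len(client_data)

def fedavg_round(global_weights, client_datasets):
    # One communication round: every client fine-tunes locally, then the server
    # averages the returned weights, weighting each client by its dataset size.
    updates, sizes = [], []
    for data in client_datasets:
        local_w, n = local_finetune(global_weights, data)
        updates.append(local_w)
        sizes.append(n)
    total = sum(sizes)
    return {
        name: sum(u[name] * (n / total) for u, n in zip(updates, sizes))
        for name in global_weights
    }

# Toy usage: a two-"layer" model and three clients with differently sized datasets.
global_w = {"embedding": np.zeros((4, 3)), "head": np.zeros((3, 2))}
clients = [list(range(100)), list(range(50)), list(range(250))]
for _ in range(3):
    global_w = fedavg_round(global_w, clients)
```

In a real cross-device deployment the local step would run a few epochs of fine-tuning on an actual language model, only a sampled subset of clients would participate in each round, and the aggregation would typically be combined with compression or partial-model updates to fit the bandwidth and memory limits the cited works discuss.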