2020
DOI: 10.48550/arxiv.2005.10406
Preprint

Training Keyword Spotting Models on Non-IID Data with Federated Learning

Abstract: We demonstrate that a production-quality keyword-spotting model can be trained on-device using federated learning and achieve comparable false accept and false reject rates to a centrally-trained model. To overcome the algorithmic constraints associated with fitting on-device data (which are inherently non-independent and identically distributed), we conduct thorough empirical studies of optimization algorithms and hyperparameter configurations using large-scale federated simulations. To overcome resource cons…

Cited by 10 publications (12 citation statements)
References 21 publications
“…Applications of Federated Learning in HAR. While many of the early use-cases of FL were related to natural language processing [21], visual recognition [24,49], and speech recognition [22] tasks, we are now also witnessing its applications in the area of human-activity recognition [3,18,48,62,63,77]. We extend this line of work on FL in HAR, albeit in the context of multi-device environments.…”
Section: Related Work
confidence: 89%
“…At a high level, FL involves repeating three steps: (i) updating the parameters of a shared prediction model locally on each remote client, (ii) sending the local parameter updates to a central server for aggregation, and (iii) receiving the aggregated prediction model back on the remote client for the next round of local updates. While many of the early use-cases of FL were related to natural language processing [21], visual recognition [24,49], and speech recognition [22] tasks, we are now also witnessing its applications in the area of human-activity recognition [17].…”
Section: Introduction
confidence: 99%
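The three-step federated loop quoted above — local update, server-side aggregation, redistribution — is the federated averaging (FedAvg) pattern. The sketch below is a minimal, hypothetical simulation of one such loop on non-IID client data; the toy linear model, learning rate, and client data are illustrative assumptions, not the paper's actual keyword-spotting setup.

```python
# Minimal FedAvg sketch: each client fits y ≈ w·x on its own (non-IID) data,
# the server averages the resulting parameters weighted by dataset size.
import numpy as np

def local_update(global_params, client_data, lr=0.1, epochs=5):
    """Step (i): a client refines the shared parameters on local data via SGD."""
    w = global_params.copy()
    for _ in range(epochs):
        for x, y in client_data:
            grad = 2 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

def fedavg_round(global_params, clients):
    """Steps (ii) and (iii): aggregate client updates on the server,
    weighted by local dataset size, and return the new shared model."""
    updates = [local_update(global_params, data) for data in clients]
    sizes = np.array([len(data) for data in clients], dtype=float)
    return np.average(updates, axis=0, weights=sizes)

# Non-IID toy data: each client's examples follow a different slope.
rng = np.random.default_rng(0)
clients = [
    [(x, slope * x) for x in rng.uniform(-1, 1, size=20)]
    for slope in (1.5, 2.0, 2.5)
]

w = np.array([0.0])
for _ in range(10):
    w = fedavg_round(w, clients)
print(float(w[0]))  # converges near the average slope, ~2.0
```

The weighting by local dataset size mirrors the standard FedAvg aggregation rule; in a production deployment the "clients" would be devices and the parameter exchange would happen over the network rather than in-process.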
“…Recently, performing on-device federated training of acoustic models has attracted considerable attention [7,9,12,23,44]. In [23], FL was employed for a keyword spotting task and the development of a wake-word detection system, whereas [7,9] investigated the effect of non-i.i.d. distributions on the same task.…”
Section: Related Work
confidence: 99%
“…Despite the growing number of studies applying FL on speech-related tasks [19,20,21,22,23], very few of these have investigated its use for end-to-end (E2E) ASR. To the best of our knowledge, existing works on FL for ASR typically rely on strong simplifying assumptions for many of these challenges, and this results in their experimental settings being still far away from the conditions in which a FL ASR would need to function.…”
Section: Introduction
confidence: 99%