2022
DOI: 10.48550/arxiv.2203.08810
Preprint

Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling

Abstract: Speech Emotion Recognition (SER) application is frequently associated with privacy concerns as it often acquires and transmits speech data at the client-side to remote cloud platforms for further processing. These speech data can reveal not only speech content and affective information but the speaker's identity, demographic traits, and health status. Federated learning (FL) is a distributed machine learning algorithm that coordinates clients to train a model collaboratively without sharing local data. This al…

Citations: cited by 1 publication (1 citation statement)
References: 23 publications
“…There are a handful of attempts in literature for applying FL in speech-related tasks. Some of these applications are: ASR [10,11,12,13,14], Keyword Spotting [15,16], Emotion Recognition [17,18,16], and Speaker Verification [19]. Notably, for combining FL with SSL, the only available works include Federated self-supervised learning (FSSL) [20] for acoustic event detection and [21], where the challenges involved in combining FL & SSL due to hardware limitations on the client are highlighted and a wav2vec 2.0 [4] model is trained with FL on Common-Voice Italian data [22] and fine-tuned for ASR.…”
Section: Related Work (mentioning)
confidence: 99%