ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp43922.2022.9747897
Don't Speak Too Fast: The Impact of Data Bias on Self-Supervised Speech Models

Cited by 5 publications (8 citation statements)
References 23 publications
“…Other corpora exist with transcript ground-truth, typically to train voice-recognition systems, e.g., Common Voice (Ardila et al 2020) and Artie Bias (Meyer et al 2020), but they are not suitable for studying transcription bias due to their short sentences and the likelihood that they have already been used to train those systems. (2020); Buolamwini and Gebru (2018)), where face recognition bias is evaluated for several personal characteristics; smart-speaker wake-word recognition (e.g., Chen et al (2021); Dubois et al (2020)), where the authors show that certain categories of people are more likely to misactivate popular smart speakers; automated speaker recognition (Hutiri and Ding 2022; Fenu et al 2021; Meng et al 2022; Hajavi and Etemad 2023), where the authors analyze fairness in recognizing speakers.…”
Section: Related Work
Confidence: 99%
“…However, little investigation has been carried out into estimating the WER gap produced by gender disparities; see, e.g., [15, 16, 17]. In this work, we perform an analysis by fine-tuning E2E models with ATC audio from different genders.…”
Section: Contribution and Motivation
Confidence: 99%
“…To address this question, we only focus on w2v2-L-60k and w2v2-L-60k+ models fine-tuned on the 32 hrs and 132 hrs sets, respectively. We analyze the WERs obtained by greedy decoding to focus only on joint acoustic and language ASR modeling (see Section 2).…”
Section: Unlabeled
Confidence: 99%