2022
DOI: 10.1111/lnc3.12474
Computational sociophonetics using automatic speech recognition

Abstract: Recent years have seen numerous advances in natural language processing that can help accelerate sociophonetic work. These include software to align speech recordings with their transcriptions, as well as to transcribe audio automatically. This solves a major bottleneck and will help process larger datasets and test hypotheses more efficiently. This paper will summarise recent progress, highlight relevant examples of sociophonetic research, and comment on the technical and ethical issues at the cutting edge of…
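Forced aligners of the kind the abstract mentions typically emit Praat TextGrid files, and sociophonetic work then reads segment boundaries out of those files. Below is a minimal, dependency-free sketch of pulling labelled interval durations from one tier of a long-form TextGrid; the tier name "phones" and the toy two-interval sample are assumptions for illustration, and a real pipeline would use a robust parser rather than this regex.

```python
import re

def interval_durations(textgrid_text, tier_name):
    """Extract (label, duration) pairs from one interval tier of a
    long-form Praat TextGrid, skipping empty (silence) intervals."""
    # Isolate the requested tier, stopping at the next tier if present.
    tier_text = textgrid_text[textgrid_text.index(f'name = "{tier_name}"'):]
    next_tier = tier_text.find("item [", 1)
    if next_tier != -1:
        tier_text = tier_text[:next_tier]
    # Each interval is an xmin / xmax / text triple on consecutive lines.
    pattern = re.compile(
        r'xmin = ([\d.]+)\s*\n\s*xmax = ([\d.]+)\s*\n\s*text = "([^"]*)"'
    )
    return [
        (label, round(float(xmax) - float(xmin), 3))
        for xmin, xmax, label in pattern.findall(tier_text)
        if label.strip()  # drop unlabelled/silent intervals
    ]

# A toy two-interval "phones" tier, shaped like aligner output.
sample = '''
item [1]:
    class = "IntervalTier"
    name = "phones"
    xmin = 0.0
    xmax = 0.5
    intervals: size = 2
    intervals [1]:
        xmin = 0.00
        xmax = 0.12
        text = "k"
    intervals [2]:
        xmin = 0.12
        xmax = 0.31
        text = "a"
'''
print(interval_durations(sample, "phones"))  # → [('k', 0.12), ('a', 0.19)]
```

From here, durations (or boundaries fed into formant extraction) can go straight into the kinds of larger-scale analyses the abstract describes.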

Cited by 7 publications (3 citation statements); references 118 publications.
“…The arduous nature of human transcription, however, creates a bottleneck. Fortunately, novel technologies allow for automatic transcription of spoken language using artificial intelligence (Coto‐Solano, 2022; Liao et al., 2023). Currently, these approaches have limitations, and they may struggle with noisy settings involving multiple speakers (Southwell et al., 2022).…”
Section: Discussion
confidence: 99%
“…Recent studies explored fine-tuning of pretrained self-supervised models for ASR using speech from low-resource languages (e.g., Coto-Solano et al. 2022; Guillaume et al. 2022), and difficulties of modeling resource-scarce languages and dialects were acknowledged in previous work (Aksënova et al., 2022). It remains an open question to what extent model performance is dependent on the amount of fine-tuning data and the type of language, when the total amount of available data for a language is limited.…”
Section: Introduction
confidence: 99%
“…Underlined text is in a language other than standard Indonesian or English. Claims of state-of-the-art performance from fine-tuning a pretrained ASR model with as little as 10 minutes of labelled data (Baevski et al., 2020) often depend on large-vocabulary language models (San et al., 2023). For contexts where matching language models are not readily available, more realistic results are to be expected, such as in Coto-Solano et al. (2022), where median word error rate (WER) ranged from 18% to 66% for Cook Islands Maori. Even with language models, WERs remained high for low-resource languages: 32.91% for read speech in the Bemba language in Sikasote and Anastasopoulos (2022) and 48% for Kurmanji Kurdish in Gupta and Boulianne (2022).…”
Section: Introduction
confidence: 99%
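The WER figures quoted above are word-level Levenshtein edit distances normalised by reference length. A minimal, dependency-free sketch follows (production work typically uses a library such as jiwer; the example sentence pair is a toy illustration, not data from the cited studies):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    via dynamic-programming edit distance over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution/match
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution plus one deletion against a 4-word reference → 2/4.
print(word_error_rate("kia orana kotou katoatoa", "kia orana katoa"))  # → 0.5
```

Note that WER can exceed 100% when the hypothesis inserts many extra words, which is why low-resource results like those above are often reported as medians across recordings.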