2021
DOI: 10.48550/arxiv.2104.00235
Preprint

Multilingual and code-switching ASR challenges for low resource Indian languages

Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, et al.

Abstract: Recently, there is increasing interest in multilingual automatic speech recognition (ASR) where a speech recognition system caters to multiple low resource languages by taking advantage of low amounts of labeled corpora in multiple languages. With multilingualism becoming common in today's world, there has been increasing interest in code-switching ASR as well. In code-switching, multiple languages are freely interchanged within a single sentence or between sentences. The success of low-resource multilingual a…

Citations: cited by 6 publications (7 citation statements)
References: 13 publications
“…Koenecke et al. (2020) compared industrial ASR systems and found a 35% word error rate for African American English compared to 19% for white speakers of American English. There has also been research on how elderly speakers are transcribed with more errors (Pellegrini et al., 2012; Vipperla et al., 2008); Gretter et al. (2020) document lower performance when transcribing non-native speech in English and German, and while there is research on transcription of code-switching (Diwan et al., 2021; Li et al., 2019; Seki et al., 2018; Yue et al., 2019), this remains a weakness for most systems. These types of sociolinguistic variables are amongst the ones researchers are most interested in, so one has to be aware of potential differences in ASR performance on the specific data of interest (Hooker, 2021; Martin & Tang, 2020).…”
Section: Automatic Speech Recognition For Sociophonetics
mentioning
confidence: 99%
“…The Indic ASR challenge 2021 [2,3] consists of two sub-tasks. In sub-task 1, the main objective is to build a multilingual ASR system for Indian languages.…”
Section: Indic ASR Challenge 2021
mentioning
confidence: 99%
“…For sub-task 2, in addition to these models, an E2E conformer [20] model was also used as a baseline. The details of these baseline models can be found in [3].…”
Section: Implementation Details
mentioning
confidence: 99%
“…However, most of these large scale models skew towards high-resourced languages [9] and do not seek to directly optimize for intra-sentential CS ASR between particular language pairs. A more promising direction towards zero-shot CS ASR can be found in prior works which seek to incorporate monolingual data directly to improve CS performance [14–28]. In particular, there are several works which achieve joint modeling of CS and monolingual ASR by conditionally factorizing the overall bilingual task into monolingual parts [29–31].…”
Section: Introduction
mentioning
confidence: 99%