ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp39728.2021.9414158
Code-Switch Speech Rescoring with Monolingual Data

Abstract: Code-switch speech recognition has long been a concern for automatic speech recognition (ASR) systems. It is challenging due to data scarcity as well as diverse syntactic structures across languages. In this paper, we focus on code-switch speech recognition in mainland China, whose linguistic characteristics differ markedly from those of Hong Kong and Southeast Asia. We propose a novel approach that only uses monolingual data for code-switch s…

Cited by 5 publications (3 citation statements)
References 14 publications
“…For each hypothesis, the second pass models calculate a score (typically, negative log-likelihood) that is combined with the first pass scores to determine the final score [4,5]. Scores can be computed using a language model [4,5,6], a masked language model [7,8], or other hypothesis-level models [9,10]. Rescoring performance has been improved by using contextual information in recent literature.…”
Section: Introduction (mentioning, confidence: 99%)
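The second-pass rescoring described in the statement above is straightforward to sketch. Below is a minimal illustration, assuming an n-best list from the first pass, a second-pass scorer that returns a negative log-likelihood per hypothesis, and a tunable interpolation weight; the `Hypothesis` container, `lm_weight`, and the toy scorer are illustrative assumptions, not details taken from the paper or the citing works.

```python
# Minimal sketch of second-pass n-best rescoring: each hypothesis gets a
# second-pass score (negative log-likelihood) that is interpolated with its
# first-pass score, and the hypothesis with the lowest combined score wins.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    first_pass_score: float  # negative log-likelihood from the first-pass decoder


def rescore_nbest(
    hypotheses: List[Hypothesis],
    second_pass_nll: Callable[[str], float],  # e.g. an LM or masked-LM scorer
    lm_weight: float = 0.5,                   # assumed interpolation weight
) -> Hypothesis:
    """Return the hypothesis with the lowest combined score (lower is better)."""
    def combined(h: Hypothesis) -> float:
        return h.first_pass_score + lm_weight * second_pass_nll(h.text)

    return min(hypotheses, key=combined)


if __name__ == "__main__":
    # Toy second-pass scorer: pretend longer sentences are less likely.
    toy_scorer = lambda text: 0.1 * len(text)
    nbest = [
        Hypothesis("play some music", first_pass_score=12.3),
        Hypothesis("play sum music", first_pass_score=12.1),
    ]
    print(rescore_nbest(nbest, toy_scorer).text)
```

In practice the interpolation weight would be tuned on a development set, and the second-pass scorer could be a neural language model or a masked language model, as the quoted passage notes.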
“…However, most of these large-scale models skew towards high-resourced languages [7] and do not seek to directly optimize for intra-sentential CS ASR between particular language pairs. A more promising direction towards zero-shot CS ASR can be found in prior works which seek to incorporate monolingual data directly to improve CS performance [12][13][14][15][16][17][18][19][20][21][22]. In particular, there are several works which achieve joint modeling of CS and monolingual ASR by conditionally factorizing the overall bilingual task into monolingual parts [23][24][25].…”
Section: Introduction (mentioning, confidence: 99%)
“…However, most of these large-scale models skew towards high-resourced languages [9] and do not seek to directly optimize for intra-sentential CS ASR between particular language pairs. A more promising direction towards zero-shot CS ASR can be found in prior works which seek to incorporate monolingual data directly to improve CS performance [14][15][16][17][18][19][20][21][22][23][24][25][26][27][28]. In particular, there are several works which achieve joint modeling of CS and monolingual ASR by conditionally factorizing the overall bilingual task into monolingual parts [29][30][31].…”
Section: Introduction (mentioning, confidence: 99%)
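For readers unfamiliar with the conditional-factorization idea mentioned in the last two statements, one common way to write it is sketched below; the Mandarin–English language pair and the conditional-independence assumption between the monolingual label sequences are chosen purely for illustration, and the cited works differ in their exact formulations.

\[
P\bigl(Y^{\mathrm{cs}} \mid X\bigr) \;=\; \sum_{Y^{\mathrm{zh}},\, Y^{\mathrm{en}}} P\bigl(Y^{\mathrm{cs}} \mid Y^{\mathrm{zh}}, Y^{\mathrm{en}}, X\bigr)\, P\bigl(Y^{\mathrm{zh}} \mid X\bigr)\, P\bigl(Y^{\mathrm{en}} \mid X\bigr)
\]

Here the monolingual terms \(P(Y^{\mathrm{zh}} \mid X)\) and \(P(Y^{\mathrm{en}} \mid X)\) can be trained on monolingual data alone, which is what makes this direction attractive when code-switched data is scarce.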