Interspeech 2019
DOI: 10.21437/interspeech.2019-1161

Code-Switching Detection Using ASR-Generated Language Posteriors

Abstract: Code-switching (CS) detection refers to the automatic detection of language switches in code-mixed utterances. This task can be achieved using a CS automatic speech recognition (ASR) system that can handle such language switches. In previous work, we investigated the code-switching detection performance of the Frisian-Dutch CS ASR system using the time alignment of the most likely hypothesis, and found that this technique suffers from over-switching due to numerous very short spurious language switches…
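The over-switching failure mode described in the abstract, many spurious switches lasting only a few frames, suggests smoothing frame-level language decisions before segmenting. Below is a minimal sketch of that idea in Python; the median-filter approach, the window size, and the (Frisian, Dutch) posterior layout are illustrative assumptions, not the paper's actual posterior-based method.

```python
import numpy as np
from scipy.ndimage import median_filter

def detect_switches(lang_posteriors, win=11):
    """Turn frame-level language posteriors into switch points.

    lang_posteriors: (T, 2) array of per-frame posteriors for
    (Frisian, Dutch); an assumed layout, not the paper's exact pipeline.
    win: median-filter window in frames; smooths out the few-frame
    language flips that cause over-switching.
    """
    # Hard frame-level decision: 0 = Frisian, 1 = Dutch (assumed order).
    labels = np.argmax(lang_posteriors, axis=1)
    # Median filtering removes runs shorter than roughly win/2 frames.
    smoothed = median_filter(labels, size=win)
    # A switch is any frame where the smoothed label changes.
    switch_frames = np.flatnonzero(np.diff(smoothed)) + 1
    return smoothed, switch_frames

# Toy usage: 100 frames of Frisian with one brief spurious Dutch blip.
post = np.full((100, 2), [0.9, 0.1])
post[40:43] = [0.2, 0.8]           # 3-frame blip: a spurious switch
_, switches = detect_switches(post)
print(switches)                    # -> [] : the blip is smoothed away
```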

Cited by 3 publications (14 citation statements). References 34 publications.
“…In speech processing, work on code-switching can be divided into code-switching detection (Rallabandi et al., 2018; Yılmaz et al., 2016; Wang et al., 2019) using language identification (Choudhury et al., 2017) and end-to-end recognition (Indra Winata et al., 2018). In this work, we look at both methods via fine-tuning of self-supervised representations, namely wav2vec 2.0 (Baevski et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%
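To ground the first approach in this statement, here is a hedged sketch of frame-level language identification on top of wav2vec 2.0 features using HuggingFace transformers; the linear-head setup and the facebook/wav2vec2-base checkpoint are illustrative choices, not the cited papers' exact fine-tuning recipe.

```python
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class CSDetector(nn.Module):
    """Frame-level language-ID head on wav2vec 2.0 features.

    A hypothetical setup, not the cited papers' recipe: a linear
    classifier over each ~20 ms wav2vec 2.0 frame predicts which
    language is being spoken, so label changes mark switch points.
    """
    def __init__(self, num_langs=2):
        super().__init__()
        self.encoder = Wav2Vec2Model.from_pretrained(
            "facebook/wav2vec2-base")     # self-supervised pretraining
        self.head = nn.Linear(self.encoder.config.hidden_size, num_langs)

    def forward(self, waveform):          # waveform: (batch, samples)
        feats = self.encoder(waveform).last_hidden_state  # (B, T, H)
        return self.head(feats)           # per-frame language logits

model = CSDetector()
audio = torch.randn(1, 16000)             # 1 s of 16 kHz audio (dummy)
logits = model(audio)                     # (1, ~49, 2) frame logits
print(logits.argmax(-1))                  # 0/1 language label per frame
```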
“…Additionally, self-supervised fine-tuning methods have also been proposed [9, 10]. To address phonemic confusion, the Mixture of Experts (MoE) has become the mainstream model architecture for CS ASR systems, due to its ability not only to exploit contextual information across languages, but also to discriminate between languages and extract language-specific representations [11][12][13][14].…”
Section: Introduction (mentioning)
confidence: 99%
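To make the MoE design concrete before the switch-based variant in the next statement, here is a toy bi-encoder-style sketch; the GRU experts, layer sizes, and two-language setup are illustrative assumptions rather than any cited system's architecture.

```python
import torch
import torch.nn as nn

class BiEncoderMoE(nn.Module):
    """Toy bi-encoder MoE layer for bilingual CS ASR (illustrative).

    One encoder per language extracts language-specific representations;
    a gate network predicts per-frame language weights and mixes the two
    encoder outputs. Both experts run on every frame, which is the extra
    compute cost the switch-style variant below avoids.
    """
    def __init__(self, dim=256):
        super().__init__()
        self.expert_l1 = nn.GRU(dim, dim, batch_first=True)  # language 1
        self.expert_l2 = nn.GRU(dim, dim, batch_first=True)  # language 2
        self.gate = nn.Linear(dim, 2)                        # per-frame mix

    def forward(self, x):                    # x: (B, T, dim) features
        h1, _ = self.expert_l1(x)            # language-1 representation
        h2, _ = self.expert_l2(x)            # language-2 representation
        w = torch.softmax(self.gate(x), -1)  # (B, T, 2) language weights
        return w[..., :1] * h1 + w[..., 1:] * h2

x = torch.randn(2, 100, 256)                 # dummy batch of features
print(BiEncoderMoE()(x).shape)               # torch.Size([2, 100, 256])
```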
“…Switch-Transformer-based MoE is another novel MoE architecture, which not only mitigates the additional computational cost incurred by bi-encoder-based MoE, but also achieves better recognition performance [13, 14, 20, 21]. However, prior works still have drawbacks, either ignoring language information or failing to achieve a real-time streaming ASR system.…”
Section: Introduction (mentioning)
confidence: 99%
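A companion sketch of the Switch-Transformer-style routing named here, again with illustrative sizes: a router assigns each frame to a single expert (top-1 routing), so only one expert's feed-forward runs per position, which is the source of the compute saving over the bi-encoder design.

```python
import torch
import torch.nn as nn

class SwitchFFN(nn.Module):
    """Toy Switch-Transformer-style MoE feed-forward (top-1 routing)."""
    def __init__(self, dim=256, n_experts=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(n_experts)])

    def forward(self, x):                       # x: (B, T, dim)
        logits = self.router(x)                 # (B, T, n_experts)
        probs = torch.softmax(logits, -1)
        top = probs.argmax(-1)                  # top-1 expert per frame
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top == i                     # frames routed to expert i
            if mask.any():
                # Scale by the router probability so routing remains
                # differentiable, as in the Switch Transformer.
                out[mask] = (probs[..., i][mask].unsqueeze(-1)
                             * expert(x[mask]))
        # Only the selected expert runs per frame, so compute stays
        # roughly constant as the number of experts grows.
        return out

x = torch.randn(2, 100, 256)
print(SwitchFFN()(x).shape)                     # torch.Size([2, 100, 256])
```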