2021
DOI: 10.3390/app11188412

Accented Speech Recognition Based on End-to-End Domain Adversarial Training of Neural Networks

Abstract: The performance of automatic speech recognition (ASR) may be degraded when accented speech is recognized because the speech has some linguistic differences from standard speech. Conventional accented speech recognition studies have utilized the accent embedding method, in which the accent embedding features are directly fed into the ASR network. Although the method improves the performance of accented speech recognition, it has some restrictions, such as increasing the computational costs. This study proposes …

citations
Cited by 16 publications
(13 citation statements)
references
References 16 publications
“…Domain Adversarial Training (DAT) is a popular and common training strategy for domain adaptation [18,42]. The main assumption of DAT is that the training and test data come from different distributions such that an effective domain transfer from the training/source domain to the test/target domain is needed.…”
Section: Domain Adversarial Training (DAT)
mentioning
confidence: 99%
“…[16] proves that performing gradient reversal in domain adversarial training (DAT) is equivalent to minimizing the difference of output distributions of different accents. DAT [29] improved the performance of accented speech recognition for both end-to-end (E2E) [18] and hybrid ASR systems [29]. The experimental results in [29] showed that the performance of DAT was better than that of multi-task learning.…”
mentioning
confidence: 97%
“…With this approach, it is believed that the output representations of the feature extractor can be domain-invariant, so the downstream model can perform comparable results in both source and target domains. [13][14][15][16] trained automatic speech recognition models to deal with accented speech with DAT. [17] proposed to train a multi-lingual speech emotion recognition model with adversarial domain adaptation.…”
Section: Introduction
mentioning
confidence: 99%
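
The gradient reversal that the citation statements above associate with DAT can be illustrated with a toy scalar model. This is only a sketch under stated assumptions, not the method of the cited paper: the names `domain_loss`, `dat_gradients`, `w`, `v`, and `lam` are hypothetical, the feature extractor and the domain (accent) classifier are each reduced to a single weight, and the gradients are derived by hand instead of with an autodiff library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss(w, v, x, domain):
    """Binary cross-entropy of a toy domain classifier.

    Assumed toy model: feature f = w * x, domain logit = v * f.
    `domain` is 0 or 1 (for example, two accent groups).
    """
    p = sigmoid(v * w * x)
    return -(domain * math.log(p) + (1 - domain) * math.log(1.0 - p))


def dat_gradients(w, v, x, domain, lam=1.0):
    """Return (grad_w, grad_v) for the domain loss with gradient reversal.

    grad_v is the ordinary gradient for the domain classifier weight.
    grad_w is the gradient for the feature extractor weight after it has
    been multiplied by -lam on its way back through the reversal point.
    """
    f = w * x                        # feature extractor output
    p = sigmoid(v * f)               # predicted probability of domain 1
    dloss_dlogit = p - domain        # d(BCE)/d(logit)
    grad_v = dloss_dlogit * f        # classifier gradient, left unchanged
    dloss_df = dloss_dlogit * v      # gradient arriving at the feature
    grad_w = (-lam * dloss_df) * x   # reversal: sign flipped, scaled by lam
    return grad_w, grad_v


if __name__ == "__main__":
    w, v, x, domain, lr = 0.5, 2.0, 1.0, 1, 0.1
    grad_w, grad_v = dat_gradients(w, v, x, domain)
    before = domain_loss(w, v, x, domain)
    # Updating only the feature weight with the reversed gradient moves
    # the domain loss upward, i.e. the feature becomes harder to classify.
    after = domain_loss(w - lr * grad_w, v, x, domain)
    print("domain loss before:", before)
    print("domain loss after feature update:", after)
```

In this sketch the classifier weight `v` still descends the domain loss, while the feature weight `w` receives the negated gradient and so ascends it. That opposition is what pushes the feature toward being domain-invariant. A full DAT setup would add the ASR task loss on top, which is omitted here.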