Interspeech 2020
DOI: 10.21437/interspeech.2020-2623

ASR-Free Pronunciation Assessment

Abstract: Most of the pronunciation assessment methods are based on local features derived from automatic speech recognition (ASR), e.g., the Goodness of Pronunciation (GOP) score. In this paper, we investigate an ASR-free scoring approach that is derived from the marginal distribution of raw speech signals. The hypothesis is that even if we have no knowledge of the language (so cannot recognize the phones/words), we can still tell how good a pronunciation is, by comparatively listening to some speech data from the targ…

Cited by 16 publications (6 citation statements)
References 20 publications (23 reference statements)
“…GoP was further extended with Deep Neural Networks (DNNs), replacing Hidden Markov Model (HMM) and Gaussian Mixture Model (GMM) techniques for acoustic modeling [4,5]. Cheng et al [8] improved the performance of GoP with the latent representation of speech extracted in an unsupervised way.…”
Section: Related Work (mentioning)
confidence: 99%
“…In addition, the pronunciation practice takes place in a "safe" environment, where the learner is not exposed to the evaluation of their peers, and the feedback is individualized to their particular needs (Dai & Wu, 2021; Evers & Chen, 2022). Automatic speech recognition technology has been one of the critical components of pronunciation assessment (Cheng et al, 2020) since the 1990s. According to the authors, pronunciation assessment takes place in three steps: segmentation of speech into smaller units, pronunciation analysis of the speech units, calculation of the score (comp.…”
Section: Literature Review (mentioning)
confidence: 99%
“…One important reason for the lack of research in this area concerns the technical challenge of automatically recording and assessing speech to use in real-time adaptive learning systems. However, methods to automatically score pronunciation accuracy in real time currently exist (e.g., Moustroufas and Digalakis, 2007; Neri et al, 2008; Cheng et al, 2020, and see www.emotech.ai) and pilot data from our lab shows promising results for the application of such methods in adaptive, speech-based learning systems (Wilschut et al, 2021). In the current study, we will further examine how to use speech in adaptive learning systems.…”
Section: Introduction (mentioning)
confidence: 96%