An increasing number of studies are exploring the benefits of automatic speech recognition (ASR)–based dictation programs for second language (L2) pronunciation learning (e.g., Chen, Inceoglu & Lim, 2020; Liakin, Cardoso & Liakina, 2015; McCrocklin, 2019), but how ASR recognizes accented speech and the nature of the feedback it provides to language learners are still largely under-researched. The current study explores whether the intelligibility of L2 speakers differs when assessed by native (L1) listeners versus ASR technology, and reports on the types of intelligibility issues encountered by the two groups. Twelve L1 listeners of English transcribed 48 isolated words targeting the /ɪ-i/ and /æ-ε/ contrasts and 24 short sentences that four Taiwanese intermediate learners of English had produced using Google’s ASR dictation system. Overall, the results revealed lower intelligibility scores for the word task (ASR: 40.81%, L1 listeners: 38.62%) than for the sentence task (ASR: 75.52%, L1 listeners: 83.88%), and highlighted strong similarities in the error types – and their proportions – identified by ASR and the L1 listeners. However, despite similar recognition scores, correlations indicated that the ASR recognition of the L2 speakers’ oral productions mirrored the L1 listeners’ judgments of intelligibility in the word and sentence tasks for only one speaker, with significant positive correlations for one additional speaker in each task. This suggests that the extent to which ASR approaches L1 listeners at recognizing accented speech may depend on individual speakers and the type of oral speech.
The current study explored the usefulness of mobile-based automatic speech recognition (ASR) pronunciation practice by investigating a) its effects on the production of four English vowels, and b) learners' perception of ASR as a learning tool. A total of 19 Korean university students produced 28 minimal pair sentences containing the English vowel contrasts /i/-/ɪ/ and /ɛ/-/æ/ (e.g., I said beat, I said bit) at pretest and posttest, and completed six sessions of ASR practice outside of class that involved voice-typing a short text, minimal pairs in sentences, and decontextualized minimal pairs. Results of acoustic analysis of F1 and F2 formant frequencies showed a meaningful improvement in frontness for the vowel /i/, but no changes for the other vowels. Overall, the majority of the participants perceived ASR as useful for pronunciation practice, but some showed skepticism and frustration regarding the current state of the technology. Further discussed are the problems and limitations that EFL learners experienced during the ASR training.