This paper reflects on two technologies, automatic speech recognition (ASR) and text-to-speech (TTS), as means of improving learners’ pronunciation with a view to successful spoken communication. It sheds light on the practical use of these technologies and describes their effectiveness, qualities, and limitations, to help teachers choose the digital resources best suited to their students’ needs. The paper reviews previous quantitative and qualitative empirical studies that investigated teachers’ and learners’ perceptions of ASR and TTS and their use as pedagogical tools for pronunciation practice. The review concludes that a) these resources seem to have the potential to enhance pronunciation practice, in terms of both perception and production; b) technology can bring considerable benefits to learners, mainly as a supplement to pronunciation teaching; and c) these digital resources give learners the opportunity to focus on their specific difficulties and receive personalized feedback while becoming more autonomous in their learning.
Following the Covid-19 pandemic, digital technology is more present in classrooms than ever. Automatic Speech Recognition (ASR) offers language learners interesting possibilities for producing more output in a foreign language (FL). ASR is especially suited to autonomous pronunciation learning when used as a dictation tool that transcribes the learner’s speech (McCROCKLIN, 2016). However, ASR tools are trained with monolingual native speakers in mind, which does not reflect the global reality of English speakers. The present study therefore examined how well two ASR-based dictation tools understand foreign-accented speech and which FL speech features cause intelligibility breakdowns. English speech samples from 15 Brazilian Portuguese speakers and 15 Spanish speakers were obtained from an online database (WEINBERGER, 2015) and submitted to two ASR dictation tools: Microsoft Word and VoiceNotebook. The resulting transcriptions were manually inspected, coded, and categorized. The results show that overall intelligibility was high for both tools. However, many features of normal FL speech, such as vowel and consonant substitution, caused the ASR dictation tools to misinterpret the message, leading to communication breakdowns. The results are discussed from a pedagogical viewpoint.
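The abstract above reports that the transcriptions were inspected and coded by hand. As a rough complement, and not the authors' procedure, a word-level error rate can give a first numeric impression of how far a dictation tool's output departs from what the speaker was asked to read. The sketch below assumes the reference text and the tool's transcription are available as plain strings; the function names and the example sentences are invented for illustration.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Count word-level edits (substitutions, insertions, deletions)
    needed to turn the hypothesis into the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # Standard Levenshtein distance, computed over words instead of characters.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word errors divided by the number of words in the reference."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return word_errors(reference, hypothesis) / ref_len


# Hypothetical example: a vowel substitution turns "ship" into "sheep".
reference = "the ship left the harbor"
transcription = "the sheep left the harbor"
print(word_errors(reference, transcription))
print(word_error_rate(reference, transcription))
```

A score of this kind only counts mismatched words. It does not say which speech feature caused the mismatch, so it would not replace the manual coding and categorization described in the abstract.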