The aim of this commentary is to propose word prosody training as scaffolding for learning the English intonation system. Drawing on Ghosh and Levis (2021), I discuss the pedagogical implications of research on vowel quality in relation to word stress instruction and the nested nature of vowels within syllables, which make up prosodic words. In light of Liu and Reed's (2021) findings on the structural complexity of intonation (i.e., its interrelated and interacting components), I hope to demonstrate the applicability of L2 phonology research to improving prosodic structure pedagogy in the context of L2 English pronunciation.

According to Ghosh and Levis (2021), word stress errors that introduce concomitant vowel errors highlight the critical role played by vowel quality in listener processing of multi-syllabic words. Pedagogically, how should these findings inform classroom practices? First, does the finding on vowel quality establish a stronger case for prerequisite vowel training in order to promote word-level intelligibility? If so, would increased emphasis on vowel quality entail spending more time on the perception of clear versus reduced English vowels and/or on the production mechanisms often missing from students' articulatory settings for making English vowels, especially the reduced vowel (i.e., the schwa)? And what about the need to address relative length, by which I mean English vowel lengths contrasted with the learners' L1 vowel lengths? In my own teaching experience, it is quite apparent that without explicit instruction on these similarities and differences and on the relative nature of vowel length, L2 learners are often ill-equipped to recognize these subtleties.