This article envisions a new, cross-modal approach to classroom music listening, one that takes advantage of students’ rising screen literacy and the ever-expanding archive of music-related visual material available on DVD and on video-sharing sites such as YouTube. It is grounded in current literature on music performance studies, embodied music cognition studies, and screen multimedia studies. Seeking ways to bridge corporeal and cognitive approaches to music analysis, the article links this literature with the recent phenomenon of ‘clip culture’ and provides a sequence of exemplar introductory clip analyses, organized according to a framework being developed to inform teacher preparation and classroom work schemes. Cross-modal listening in the context of the emerging clip culture, the article proposes, has the potential to contribute to the ways musical life and education are undergoing reorganization according to plural, co-existing contexts and perspectives, including in the areas of pedagogy, technology and media, and repertoire and analysis.