This article reviews research into the role of audio-visual input in second language (L2) or foreign language learning. It also addresses the effectiveness for L2 learning of audio-visual input accompanied by different types of on-screen text, such as subtitles (i.e., on-screen text in learners’ first language) and captions (i.e., on-screen text in the same language as the L2 audio). The review covers the following themes: (a) the characteristics of audio-visual input, such as its multimodal nature and the vocabulary demands of video; (b) L2 learners’ comprehension of audio-visual input and the role of different types of on-screen text; (c) the effectiveness of audio-visual input and on-screen text for aspects of L2 learning, including vocabulary, grammar, and listening; and (d) research into L2 learners’ use and perceptions of audio-visual input and on-screen text. The review ends with a consideration of implications for teaching practice and a conclusion that considers the generalizability of current research in relation to suggestions for future research.