Three studies investigated whether and under what conditions the addition of on-screen text would facilitate the learning of a narrated scientific multimedia explanation. Students were presented with an explanation about the process of lightning formation either in the auditory modality alone (nonredundant) or in both the auditory and visual modalities (redundant). In Experiment 1, the effects of preceding the nonredundant or redundant explanation with a corresponding animation were examined. In Experiment 2, the effects of presenting the nonredundant or redundant explanation with a simultaneous or a preceding animation were compared. In Experiment 3, environmental sounds were added to the nonredundant or redundant explanation. Learning was measured by retention, transfer, and matching tests. Students comprehended the explanation better when the words were presented both auditorily and visually rather than auditorily only, provided there was no other concurrent visual material. The overall pattern of results can be explained by a dual-processing model of working memory, which has implications for the design of multimedia instruction.