Incoming information from multiple sensory channels competes for attention. Processing the relevant inputs and ignoring distractors, while at the same time monitoring the environment for potential threats, is crucial for survival throughout the lifespan. However, sensory and cognitive mechanisms often decline with age, making older adults more susceptible to distraction. Previous interventions in older adults have successfully improved resistance to distraction, but the inclusion of multisensory integration, with its unique properties in attentional capture, in the training protocol is underexplored. Here, we studied whether, and how, a 4-week intervention targeting audiovisual integration affects the ability to deal with task-irrelevant unisensory deviants within a multisensory task. Musically naïve participants engaged in a computerized music reading game and were asked to detect audiovisual incongruences between the pitch of a song’s melody and the position of a disk on the screen, similar to a simplified music staff. The effects of the intervention were evaluated via behavioral and EEG measurements in young and older adults. Behavioral findings include the absence of age-related differences in distraction and an indirect improvement of performance due to the intervention, seen as an amelioration of response bias. An asymmetry between the effects of auditory and visual deviants was identified and attributed to modality dominance. The EEG results showed that, after training, both groups exhibited increased activation strength in the left dorsolateral prefrontal cortex when processing auditory deviants. A functional connectivity analysis revealed that only young adults showed improved information flow, in a network composed of a fronto-parietal subnetwork and a multisensory temporal area.
Overall, both behavioral measures and neurophysiological findings suggest that the intervention was indirectly successful, driving a shift in response strategy in the cognitive domain and in higher-level or multisensory brain areas, while leaving lower-level unisensory processing unaffected.