This research assesses how audiovisual speech integration mechanisms are modulated by sensory and cognitive variables. For this purpose, the McGurk effect (McGurk & MacDonald, 1976) was used as an experimental paradigm. This effect occurs when participants are exposed to incongruent auditory and visual speech signals. For example, when an auditory /b/ is dubbed onto a visual /g/, listeners are led to perceive a fused phoneme such as /d/. With the reverse pairing (an auditory /g/ dubbed onto a visual /b/), they experience a combination such as /bg/. In two experiments, auditory intensity (40, 50, 60, and 70 dB), face size (large: 19 × 23 cm; small: 1.8 × 2 cm), and instructions ("multiple choice" and "free response") were manipulated. Face size and instruction were between-participants variables in both experiments, whereas intensity was a within-participants variable in the first experiment and a between-participants variable in the second. The main effect of instruction was highly significant in both experiments, with the "multiple choice" condition giving rise to more illusions than the "free response" condition. The effect of intensity was significant in the second experiment only: illusions were more numerous at 40 dB than at the other three intensities. Finally, a small effect of face size was observed in the second experiment only, with illusions being slightly more numerous with the large face. These results indicate that the processing chain underlying audiovisual speech perception is modulated by the perceptual salience of the visual and auditory inputs as well as by cognitive variables.