Schutz and Lipscomb (2007) reported an audiovisual illusion in which the length of the gesture used to produce a sound altered the perception of that sound's duration. This finding contradicts the widely accepted claim that the auditory system generally dominates temporal tasks because of its superior temporal acuity. Here, in the first of 4 experiments, we show that impact gestures influence duration ratings of percussive but not sustained sounds. In the 2nd, we show that the illusion is present even if the percussive sound occurs up to 700 ms after the visible impact, but disappears if the percussive sound precedes the visible impact. In the 3rd experiment, we show that only the motion after the visible impact influences perceived tone duration. The 4th experiment (replacing the impact gestures with the written text "long" and "short") suggests that the phenomenon is not due to response bias. Given that visual influence in this paradigm depends on the presence of an ecologically plausible audiovisual relationship, we conclude that cross-modal causality plays a key role in governing the integration of sensory information.