The existence of long-term auditory representations for phonemes is well established. However, since speech perception is typically audiovisual, we hypothesized that long-term phoneme representations may also contain information about the speaker's mouth shape during articulation. We used an audiovisual oddball paradigm in which, on each trial, participants saw a face and heard one of two vowels. One vowel occurred frequently (standard), while the other occurred rarely (deviant). In one condition (neutral), the face had a closed, non-articulating mouth. In the other condition (audiovisual violation), the mouth shape matched the frequent vowel. Although stimuli were audiovisual in both conditions, we hypothesized that participants would perceive identical auditory changes differently. Namely, in the neutral condition, deviants violated only the audiovisual pattern specific to each block. By contrast, in the audiovisual violation condition, deviants additionally violated long-term representations of how a speaker's mouth looks during articulation. We compared the amplitudes of the mismatch negativity (MMN) and P3 components elicited by deviants in the two conditions. The MMN extended posteriorly over temporal and occipital sites even though deviants contained no visual changes, suggesting that deviants were perceived as interruptions in audiovisual, rather than auditory-only, sequences. As predicted, deviants elicited larger MMN and P3 components in the audiovisual violation condition than in the neutral condition. The results suggest that long-term representations of phonemes are indeed audiovisual.