This paper considers whether vowel systems are organized not only around principles of auditory-acoustic dispersion, but also around non-auditory perceptual factors, specifically vision. Three experiments examine variability in the production and perception of the cot-caught contrast among speakers from Chicago, where /ɑ/ (cot) and /ɔ/ (caught) have been influenced by the spread and reversal of the Northern Cities Shift. Dynamic acoustic and articulatory analysis shows that acoustic strength of the contrast is greatest for speakers with NCS-fronted cot, which is distinguished from caught by both tongue position and lip rounding. In hyperarticulated speech, and among younger speakers whose cot-caught contrast is acoustically weak due to retraction of cot, cot and caught tend to be distinguished through lip rounding alone. An audiovisual perception experiment demonstrates that visible lip gestures enhance perceptibility of the cot-caught contrast, such that visibly round variants of caught are perceptually more robust than unround variants. It is argued that articulatory strategies which are both auditorily and visually distinct may be preferred to those that are distinct in the auditory domain alone. Implications are considered for theories of hyperarticulation/clear speech, sound change, and the advancement of low back vowel merger in North American English.