We present a model for the evolution of vowel sounds in human languages, in which words behave as Brownian particles diffusing in acoustic space, interacting via the vowel sounds they contain. Interaction forces, derived from a simple model of the language-learning process, are attractive at short range and repulsive at long range. These forces generate sets of acoustic clusters, each representing a distinct sound, which form patterns with statistical properties similar to those of real vowel systems. Our formulation may be generalized to account for the spontaneous, self-actuating shifts in system structure that are observed in real languages, and to combine in one model two previously distinct theories of vowel system structure: dispersion theory, which assumes that vowel systems maximize contrasts between sounds, and quantal theory, according to which nonlinear relationships between articulatory and acoustic parameters are the source of patterns in sound inventories. By formulating the dynamics of vowel sounds in terms of interparticle forces, we also provide a simple, unified description of the linguistic notion of push and pull dynamics in vowel systems.