Language, similarly to other aspects of culture, is an evolutionary system in its own right, constantly shaped by adaptive pressures and neutral processes [1,2]. There are currently about 7,000 spoken languages [3], and an essential aspect of this diversity is represented by their speech sounds (phonetics and phonology). There is wide cross-linguistic variation at this level [4], and a crucial question concerns the factors and processes driving the emergence and maintenance of this diversity [5]. Most sound changes are due to language-internal factors, such as co-articulation and misperception [6], but recent studies suggest that external factors might also generate pressures to which sound systems adapt [5]. For instance, it has been suggested that aspects of the physical environment that vary spatially (e.g., altitude or air humidity) affect the physiology of speech production differently in different populations, resulting in differences between the speech sounds that occur in different languages [5,7,8].
However, our own cognitive, physiological and anatomical biases are probably the most important components shaping languages. Biases that are shared by all humans result in linguistic universals and universal tendencies [6,9]. In previous work, however, we have argued that the extensive inter-individual variation that exists at all levels, from the molecular to the anatomical, physiological and neuro-cognitive, also plays a role in the emergence of cross-linguistic variation [5,10].
There is widespread variation at all levels between individuals and groups, including in genetics, anatomy and physiology, arising from our complex evolutionary history [11-15]. Here, we focus on variation in the morphology of the vocal tract (VT; see Fig. 1), which, despite the rather sparse evidence [5,16-19], is no exception [20-23].
Using high-quality data from a large multi-ethnic sample, we show that the oral part of the VT has overlapping but statistically distinguishable patterns of variation between participants from four broad ethno-linguistic groups. As we argue in detail elsewhere [16,17], variation in VT anatomy can produce articulatory biases that survive compensatory mechanisms and that result in subtle acoustic or coarticulatory effects [19,24]. These weak effects can be amplified by the repeated use and transmission of language, influencing the processes of sound change and ultimately affecting the patterns of linguistic diversity [5,16]. However, this is an extremely complex, long and heterogeneous causal path with feedback loops, which must be investigated using methodologies and data from several scientific disciplines [10,16,25]. We have previously shown, using biomechanical modelling, that click production is affected by the shape of the alveolar ridge [17], and that the covert articulatory strategies used by non-native participants to produce the North American English 'r' sound are influenced by the shape of their hard palate [16].
We test here the hypothesis that the usually weak effects of such 'idiosyncratic' var...