The aim of this thesis is to investigate experimentally whether cross-linguistic variation in the structure of languages can be motivated by their external environment. It has been suggested that variation results not only from cultural drift and language-internal mechanisms but also from social or even physical factors. However, from observational data and correlations between variables alone, it remains difficult to infer the exact underlying mechanisms. Here, I present a novel experimental approach for studying the relationship between language and environment under controlled laboratory conditions. I argue that to arrive at a causal understanding of linguistic adaptation, we can use a cultural evolutionary approach and simulate the emergence of linguistic structure with humans in the lab. This way, it can be tested which pressures shape linguistic features as they are used for communication and transmitted to new speakers. I focus primarily on cases where linguistic conventions emerge in referential communication games in direct face-to-face interaction. In these settings, I test whether specific conventions are more adaptive for solving the same problem under different conditions or affordances imposed by the environment. A series of silent-gesture experiments shows that systematicity (the design feature giving language its compositional power) is sensitive to the communicative environment: dyads creating novel gestural communication systems to communicate pictorial referents are more likely to systematize traits and create categories that are functionally relevant in the given environment. Additionally, environmental features, such as the size of the meaning space and visibility of referents, affect the degree to which participants rely on systematic rather than simple holistic gestures.
This ‘experimental semiotics’ approach thus models how environmental factors could motivate basic linguistic structure.

However, for complex real-world phenomena, such as the hotly debated relationship between spatial language and environment, it is difficult to design simple experiments that isolate variables of interest but retain the necessary level of realism. It has been proposed that topography (e.g., landmarks like rivers, slopes) and sociocultural factors (e.g., bilingualism, subsistence style, population density) can affect whether speakers rely on an egocentric or geocentric Frame of Reference (FoR) to encode spatial relations, but it remains hard to disentangle the exact contribution of these variables to the cross-linguistic variation we observe.

I tackle this issue with a novel paradigm: interactive Virtual Reality (VR) experiments that allow for an unprecedented combination of ecological validity and experimental control. In networked VR setups, participants are immersed in realistic environments such as a forest or a mountain slope. By having dyads solve spatial coordination games, I show that speakers of English, which is usually associated with an egocentric FoR, are less likely to use egocentric language (e.g., “the orb is to your left”) if there are strong environmental affordances that make geocentric language more viable (e.g., “the orb is uphill from you”). Further experiments address whether the cultural ‘success’ of egocentric left/right could be motivated by its applicability across environments. For this, I combine VR with the ‘experimental semiotics’ approach, where the game is solved via a novel visual communication channel. I show how the movement data in the 3D world can be correlated with invented signals to measure which FoR participants rely on. Unlike in the English data, I did not find an advantage for geocentric systems in the slope environment, and overwhelmingly egocentric systems emerged.
I discuss how this could relate to task-specificity and native language background. More generally, I show how this new way of studying spatial language with interactive VR games can be used to test hypotheses about linguistic transmission and material culture that could help explain the origins of the egocentric FoR system, which is regarded as a fairly recent cultural innovation.

Taken together, the thesis comprises several studies testing the relationship between linguistic and environmental variables. Additionally, VR is presented as a novel tool to study spatial language in controlled large-scale settings, complementing more traditional fieldwork. More broadly, I suggest that VR can be used to study the evolution of language in complex, multimodal settings without sacrificing experimental control.