How language began is one of the oldest questions in science, but theories remain speculative due to a lack of direct evidence. Here, we report two experiments that generate empirical evidence to inform gesture-first and vocal-first theories of language origin; in each, we tested modern humans' ability to communicate a range of meanings (995 distinct words) using either gesture or non-linguistic vocalization. Experiment 1 is a cross-cultural study, with signal Producers sampled from Australia (n = 30, mean age = 32.63, s.d. = 12.42) and Vanuatu (n = 30, mean age = 32.40, s.d. = 11.76). Experiment 2 is a cross-experiential study in which Producers were either sighted (n = 10, mean age = 39.60, s.d. = 11.18) or severely vision-impaired (n = 10, mean age = 39.40, s.d. = 10.37). A group of undergraduate student Interpreters (n = 140) guessed the meaning of the signals created by the Producers. Communication success was substantially higher in the gesture modality than in the vocal modality (roughly twice as high overall; 61.17% versus 29.04% success). This was true within cultures, across cultures and even for the signals produced by severely vision-impaired participants. The success of gesture is attributed in part to its greater universality (i.e. similarity in form across different Producers). Our results support the hypothesis that gesture is the primary modality for language creation.