“…Systematic auditory–visual mappings have been found in multiple domains, including between loudness and size (e.g., Smith & Sera), pitch and brightness (e.g., Marks; Melara; Mondloch & Maurer), pitch and shape (e.g., Marks), and pitch and visuo‐spatial height (e.g., Chiou & Rich). Evidence suggests that such correspondences are recruited in spoken language to convey visuo‐spatial properties of linguistic referents (e.g., Nygaard et al.; Perlman et al.; Shintel et al.; Tzeng et al.). For example, speakers spontaneously modulated their verbal descriptions of vertically moving dots such that descriptions of upward moving dots (e.g., “It is going up.”) were higher pitched than those of downward moving dots (Shintel et al.).…”