Across the millennia, and across a range of disciplines, there has been a widespread desire to connect, or translate between, the senses in a manner that is meaningful, rather than arbitrary. Early examples were often inspired by the vivid, yet mostly idiosyncratic, crossmodal matches expressed by synaesthetes, often exploited for aesthetic purposes by writers, artists, and composers. A separate approach comes from those academic commentators who have attempted to translate between structurally similar dimensions of perceptual experience (such as pitch and colour). However, neither approach has succeeded in delivering consensually agreed crossmodal matches. Consequently, an alternative approach to sensory translation is needed. In this narrative historical review, focusing on the translation between audition and vision, we attempt to shed light on the topic by addressing the following three questions: (1) How is the topic of sensory translation related to synaesthesia, multisensory integration, and crossmodal associations? (2) Are there common processing mechanisms across the senses that can help to guarantee the success of sensory translation, or, rather, is mapping among the senses mediated by allegedly universal (e.g., amodal) stimulus dimensions? (3) Is the term ‘translation’ in the context of cross-sensory mappings used metaphorically or literally? Given the general mechanisms and concepts discussed throughout the review, the answers we reach regarding the nature of audio-visual translation are likely to apply to translation between other, perhaps less frequently studied, modality pairings as well.