The notion of harmony was first developed in the context of metaphysics before being applied to the domain of music. In recent centuries, however, those working in the visual arts have also often used the term to describe especially pleasing combinations of colors. Similarly, the harmonization of flavors is nowadays often invoked as one of the guiding principles underpinning the deliberate pairing of food and drink. Beyond these uses of the term to describe and construct pleasurable unisensory perceptual experiences, it has also been suggested that music and painting may be combined harmoniously (e.g., see the literature on “color music”). Furthermore, those working in the area of “sonic seasoning” sometimes describe certain sonic compositions as harmonizing crossmodally with specific flavor sensations. In this review, we take a critical look at the putative meaning(s) of the term “harmony” when used in a crossmodal, or multisensory, context. We also address the question of whether the term's use outside of a strictly unimodal auditory context should be understood literally or merely metaphorically (i.e., as a shorthand to describe those combinations of sensory stimuli that, for whatever reason, appear to go well together, and which can hence be processed especially fluently).