While low-level cross-modal correspondences between vision and hearing are well documented (e.g., the Kiki-Bouba effect), it is unclear whether such correspondences are perceivable between complex, multi-dimensional stimuli, such as contemporary music and art. Further, previous studies show conflicting results regarding whether audiovisual correspondence affects subjective aesthetic experience. Here, in collaboration with the Kentler International Drawing Space (NYC, USA), we use material from the Music as Image and Metaphor exhibition, consisting of works of visual art and music composed for each work. Our pre-registered online experiment consisted of four conditions: Audio, Visual, Audiovisual-Intended (artist-intended pairing of art and music), and Audiovisual-Random (randomly shuffled pairings). Participants (N = 201) were presented with 16 pieces and could click to proceed to the next piece whenever they liked. After each piece, they were asked about their subjective experience. Analyzing results by condition, we found that participants spent significantly more time with Audio pieces, followed by Audiovisual pieces, followed by Visual pieces; however, they felt most moved in the Audiovisual (bimodal) conditions. The Audiovisual-Intended pieces were perceived to have greater correspondence than those in the Audiovisual-Random condition. Interestingly, though, there were no significant differences between these two conditions on any other subjective rating scale or in time spent. Collectively, these results extend our understanding of cross-modal correspondence to complex, professional, real-world abstract art and contemporary music, and call into question the use of time spent as an implicit measure of aesthetic appreciation in multimodal conditions.