Poetry can be experienced in multiple sensory modalities: for example, someone might read a written poem (i.e., a visual modality) or listen to a spoken poem (i.e., an auditory modality). Readers may also follow along with a written poem while listening to the spoken version, and thereby experience the poem multimodally. Here, we examined whether aesthetic judgments of poems differ based on the sensory modality in which they are experienced. Participants (N=233) rated three subjective characteristics of poems (vividness of evoked imagery, emotional valence, and emotional arousal), as well as the poems' overall aesthetic appeal. Participants were randomly assigned to one of three modalities: text-only (N=81), audio-only (N=74), or combined audio/text (N=78). Participants found the audio-only modality the least aesthetically appealing, compared with the text-only and combined audio/text modalities. Additionally, vividness of imagery was the most important predictor of aesthetic appeal overall (across all three modalities), but we also identified a significant interaction between stimulus modality and imagery, such that vividness was most important for text-only poems. Finally, we replicated prior work on individual differences in aesthetic appeal: interrater agreement for aesthetic appeal ratings of poems was low. These findings contribute to our understanding of the complexity of the aesthetic experience of poetry and highlight the significance of individual differences and the role of modality in its appreciation.