Recent fMRI studies of event segmentation have found that default mode regions represent high-level event structure during movie watching. In these regions, neural patterns are relatively stable during events and shift at event boundaries. Music, like narratives, contains hierarchical event structure (e.g., sections are composed of phrases). Here, we tested the hypothesis that brain activity patterns in default mode regions reflect the high-level event structure of music. We used fMRI to record brain activity from 25 participants (male and female) as they listened to a continuous playlist of 16 musical excerpts, and additionally collected annotations for these excerpts by asking a separate group of participants to mark when meaningful changes occurred in each one. We then identified temporal boundaries between stable patterns of brain activity using a hidden Markov model and compared the location of the model boundaries to the location of the human annotations. We identified multiple brain regions with significant matches to the observer-identified boundaries, including auditory cortex, mPFC, parietal cortex, and angular gyrus. From these results, we conclude that both higher-order and sensory areas contain information relating to the high-level event structure of music. Moreover, the higher-order areas in this study overlap with areas found in previous studies of event perception in movies and audio narratives, including regions in the default mode network.

Significance Statement

Listening to music requires the brain to track dynamics at multiple hierarchical timescales. In our study, we had fMRI participants listen to real-world music (classical and jazz pieces) and then used an unsupervised learning algorithm (a hidden Markov model) to model the high-level event structure of music within participants’ brain data. This approach revealed that default mode brain regions involved in representing the high-level event structure of narratives are also involved in representing the high-level event structure of music. These findings provide converging support for the hypothesis that these regions play a domain-general role in processing stimuli with long-timescale dependencies.
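
As an illustration of the boundary comparison described above, the following is a minimal sketch of one way to score how well a set of model boundaries matches human annotations. It is not the authors' implementation. The function names (`count_matches`, `permutation_z`), the tolerance of 3 timepoints, and the null distribution built from uniformly random boundary placements are all assumptions made for this example, and the hidden Markov model fit itself is not shown: the model boundaries are taken as given.

```python
import numpy as np


def count_matches(model_bounds, human_bounds, tolerance=3):
    """Count human boundaries with a model boundary within `tolerance` timepoints.

    `tolerance=3` is an assumed value for illustration only.
    """
    model_bounds = np.asarray(model_bounds)
    return sum(
        bool(np.any(np.abs(model_bounds - h) <= tolerance)) for h in human_bounds
    )


def permutation_z(model_bounds, human_bounds, n_timepoints,
                  tolerance=3, n_perm=1000, seed=0):
    """Compare the observed match count to a null of random boundaries.

    Assumed null: the same number of boundaries, placed uniformly at random
    without replacement among timepoints 1..n_timepoints-1.
    """
    rng = np.random.default_rng(seed)
    observed = count_matches(model_bounds, human_bounds, tolerance)
    candidates = np.arange(1, n_timepoints)
    null = np.empty(n_perm)
    for i in range(n_perm):
        random_bounds = rng.choice(candidates, size=len(model_bounds), replace=False)
        null[i] = count_matches(random_bounds, human_bounds, tolerance)
    spread = null.std()
    # Guard against a degenerate null with zero variance.
    z = (observed - null.mean()) / spread if spread > 0 else float("nan")
    return observed, z


if __name__ == "__main__":
    # Hypothetical boundaries, in timepoints, for a single excerpt.
    model = [10, 50, 90]
    human = [11, 52, 70]
    observed, z = permutation_z(model, human, n_timepoints=120)
    print(f"matches: {observed}, z vs. random boundaries: {z:.2f}")
```

A z-score well above zero would indicate that the model boundaries fall near the annotated boundaries more often than randomly placed boundaries do. In a whole-brain analysis, a score of this kind would be computed per region and corrected for multiple comparisons.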