The perception of tension and release dynamics constitutes one of the essential aspects of music listening. However, modeling musical tension to predict listeners' perception has challenged researchers. Seminal work demonstrated that tension reported continuously by listeners can be accurately predicted from a discrete set of musical features, combined into a weighted sum of slopes that reflects their joint dynamics over time. However, this model lacks an automatic feature-extraction pipeline that would make it widely accessible to researchers in the field. Here, we propose an updated version of a predictive tension model that operates on musical audio as its only input. Using state-of-the-art music information retrieval (MIR) methods, we automatically extract a set of five features (loudness, pitch height, dissonance, tempo, and onset frequency) to use as predictors of musical tension. The algorithm was trained to best predict behavioral tension ratings collected on a variety of pieces, and its performance was tested by assessing the correlation between the predicted tension and unseen continuous behavioral tension ratings. We hope that providing the research community with an open-source algorithm for predicting musical tension will motivate further work in the music cognition field toward elucidating tension dynamics and their neural and cognitive correlates across musical genres and cultures.
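To make the pipeline concrete, the sketch below shows one way such an audio-only tension predictor could be assembled with librosa. The feature proxies (RMS energy for loudness, spectral centroid for pitch height, spectral flatness for dissonance, onset strength for onset frequency), the omission of a frame-wise tempo track, and the uniform default weights are illustrative assumptions, not the authors' exact implementation; in practice the weights would be fit to the continuous behavioral ratings.

```python
# Minimal sketch of an audio-only tension model: extract frame-wise features,
# z-score them, and combine their slopes into a weighted sum (assumed setup).
import numpy as np
import librosa

def tension_curve(audio_path, hop=512, weights=None):
    y, sr = librosa.load(audio_path)

    # Frame-wise feature time series, all aligned to the same hop length.
    loudness = librosa.feature.rms(y=y, hop_length=hop)[0]
    pitch_height = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=hop)[0]
    dissonance = librosa.feature.spectral_flatness(y=y, hop_length=hop)[0]
    onsets = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop)

    # Truncate to a common length and z-score each feature.
    n = min(map(len, (loudness, pitch_height, dissonance, onsets)))
    feats = np.vstack([loudness[:n], pitch_height[:n], dissonance[:n], onsets[:n]])
    feats = (feats - feats.mean(axis=1, keepdims=True)) / (feats.std(axis=1, keepdims=True) + 1e-9)

    # Tension as a weighted sum of feature slopes (finite differences),
    # smoothed with a ~2 s moving average; weights would normally be fit
    # to behavioral tension ratings rather than fixed uniformly.
    slopes = np.gradient(feats, axis=1)
    if weights is None:
        weights = np.ones(feats.shape[0]) / feats.shape[0]
    tension = weights @ slopes
    win = max(1, int(2 * sr / hop))
    return np.convolve(tension, np.ones(win) / win, mode="same")
```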
Speech and music signals show rhythmicity in their temporal structure, with slower rhythmic rates in music than in speech. Speech processing has been related to brain rhythms in the auditory and motor cortex at around 4.5 Hz, while music processing has been associated with motor cortex activity at around 2 Hz, reflecting the temporal structures of speech and music. In addition, slow motor cortex rhythms have been suggested to be central for timing in both domains. It thus remains unclear whether domain-general or frequency-specific mechanisms drive speech and music processing. Additionally, for speech processing, auditory-motor cortex coupling and perception-production synchronization at 4.5 Hz have been related to enhanced auditory perception in various tasks. However, it is unknown whether this effect generalizes to synchronization and perception in music at distinct optimal rates. Using a behavioral protocol, we investigated whether (1) perception-production synchronization shows distinct optimal rates for speech and music, and (2) optimal rates in perception are predicted by synchronization strength at different timescales. A perception task involving speech and music stimuli and a synchronization task using tapping and whispering were conducted at slow (~2 Hz) and fast (~4.5 Hz) rates. Results revealed that synchronization was generally better at slow rates. Importantly, at slow but not fast rates, tapping outperformed whispering, suggesting domain-specific rate preferences. Accordingly, synchronization performance was highly correlated across domains at fast but not at slow rates. Altogether, perception of speech and music was optimal at different timescales and was predicted by auditory-motor synchronization strength. Our data suggest different optimal timescales for music and speech processing with partially overlapping mechanisms.
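As an illustration of how "synchronization strength" in such a tapping or whispering task might be quantified, the sketch below computes the circular resultant vector length (phase-locking) of produced event onsets relative to an isochronous stimulus at the slow (~2 Hz) or fast (~4.5 Hz) rate. The metric and the assumption of an isochronous reference are illustrative choices, not necessarily the analysis used in the study.

```python
# Minimal sketch (assumed metric): phase-locking of produced events to an
# isochronous stimulus as a measure of perception-production synchronization.
import numpy as np

def sync_strength(event_times, stimulus_rate_hz):
    """Circular resultant vector length of event phases relative to the stimulus.

    event_times: onset times (s) of taps or whispered syllables.
    stimulus_rate_hz: stimulus rate, e.g. 2.0 (slow) or 4.5 (fast).
    Returns a value in [0, 1]; 1 means perfectly phase-locked production.
    """
    period = 1.0 / stimulus_rate_hz
    phases = 2 * np.pi * (np.asarray(event_times) % period) / period
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Example: simulated taps jittering slightly around a 2 Hz stimulus.
taps = np.arange(0, 10, 0.5) + np.random.normal(0, 0.02, 20)
print(sync_strength(taps, 2.0))
```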