2022
DOI: 10.48550/arxiv.2202.10453
Preprint

Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses

Abstract: Although media content is increasingly produced, distributed, and consumed in multiple combinations of modalities, how individual modalities contribute to the perceived emotion of a media item remains poorly understood. In this paper we present MusicVideos (MuVi), a novel dataset for affective multimedia content analysis to study how the auditory and visual modalities contribute to the perceived emotion of media. The data were collected by presenting music videos to participants in three conditions: music, visual, and audiovisual.

Cited by 2 publications (5 citation statements)
References 62 publications
“…For example, Makris et al. have proposed an AI-AMG system with high-level conditional information that is fed into the encoder part of a sequence-to-sequence architecture for generating valence-specific affective music in a more controllable manner (e.g., the music is generated to match a profile of valence values provided by the user) [43]. However, a challenge in this direction is that datasets labelled with reliable affective information are scarce, although some researchers are working on this limitation [10]. Apart from using a conditional architecture, another potential approach to improve controllability is to use reinforcement learning to train the neural network models [4].…”
Section: Challenges
confidence: 99%
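
To make the conditioning idea in the statement above concrete, here is a minimal sketch of feeding a per-step valence profile into the encoder of a sequence-to-sequence model. This is an illustrative PyTorch sketch, not the architecture from [43]; the module layout, token vocabulary, and dimensions are all invented for the example.

import torch
import torch.nn as nn

class ValenceConditionedSeq2Seq(nn.Module):
    def __init__(self, vocab_size=128, emb_dim=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Assumption for illustration: the per-step valence value is
        # concatenated to each input embedding before it enters the
        # encoder, so the encoder's summary state carries the desired
        # affect profile.
        self.encoder = nn.GRU(emb_dim + 1, hidden, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src_tokens, valence, tgt_tokens):
        # src_tokens: (B, T) event ids; valence: (B, T) values in [-1, 1]
        src = torch.cat([self.embed(src_tokens), valence.unsqueeze(-1)], dim=-1)
        _, state = self.encoder(src)          # condition summarized in state
        dec, _ = self.decoder(self.embed(tgt_tokens), state)
        return self.out(dec)                  # (B, T, vocab_size) logits

model = ValenceConditionedSeq2Seq()
src = torch.randint(0, 128, (2, 16))          # toy event sequences
valence = torch.rand(2, 16) * 2 - 1           # user-supplied valence curve
logits = model(src, valence, src)             # teacher-forced decode
print(logits.shape)                           # torch.Size([2, 16, 128])

A reinforcement-learning variant, as the statement suggests, would instead train such a model against a reward that scores how well generated sequences match the target valence profile.
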
“…The relationship between music and emotions has been scientifically explored for at least a hundred years (e.g., Seashore [7]), with a surge of interest beginning in the 1950s with Meyer [8], and expanding even more widely in recent decades, both in music psychology [9, 10, 11] and Music Information Retrieval (MIR) [5, 12, 13, 14]. Generally, there are two main ways of capturing and representing emotion in music: categorical and dimensional [15].…”
Section: Related Work
confidence: 99%
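
As a concrete illustration of the categorical/dimensional distinction named above, the sketch below represents the same clip both ways. The label set and value ranges are assumptions for illustration, not definitions taken from [15].

from dataclasses import dataclass

# Categorical: one discrete term per clip (illustrative label set).
CATEGORICAL_LABELS = {"happy", "sad", "tender", "fearful", "angry"}

@dataclass
class DimensionalRating:
    valence: float  # unpleasant (-1) .. pleasant (+1), assumed range
    arousal: float  # calm (-1) .. excited (+1), assumed range

# The same clip described both ways:
label = "happy"                                       # categorical
rating = DimensionalRating(valence=0.8, arousal=0.6)  # dimensional
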
“…More recently, [25] provided a method for mapping discrete emotional terms onto Russell’s dimensional model. Finally, Chua et al.’s MuVi dataset [5] provides both dynamic ratings along the valence and arousal dimensions throughout the song and static emotion labels (categorical labels for the entire song) for music and video stimuli.…”
Section: Related Work
confidence: 99%
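
A hedged sketch of what a record pairing dynamic and static annotations might look like, in the spirit of the MuVi annotations described above; the field names, coordinates, and values are hypothetical and do not reproduce the actual dataset schema from [5] or the mapping from [25].

# Hypothetical record: per-window dimensional ratings plus one
# categorical label for the whole song.
record = {
    "stimulus": "clip_001",             # invented identifier
    "condition": "music",               # one of the presentation conditions
    "dynamic": [                        # (valence, arousal) per time window
        (0.10, 0.25), (0.30, 0.40), (0.45, 0.35),
    ],
    "static_label": "happy",            # categorical label for the song
}

# Placing the discrete label in Russell's valence-arousal plane;
# the coordinates here are assumed for illustration.
TERM_TO_VA = {"happy": (0.8, 0.5), "sad": (-0.7, -0.4)}
valence, arousal = TERM_TO_VA[record["static_label"]]
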