ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2019.8682974
Attitude Recognition Using Multi-resolution Cochleagram Features

Abstract: Attitudes play an important role in human communication. Models and algorithms for automatic recognition of attitudes therefore may have applications in areas where successful communication and interaction are crucial, such as healthcare, education and digital entertainment. This paper focuses on the task of categorizing speaker attitudes using speech features. Data extracted from video recordings are employed in training and testing of predictive models consisting of different sets of speech features. A novel…

Cited by 14 publications (13 citation statements)
References 15 publications
“…MRCG functionals: MRCG features were proposed by Chen et al [30] and have since been used in speech-related applications such as voice activity detection [61], speech separation [30], and more recently for attitude recognition [38]. MRCG features are based on cochleagrams [62].…”
Section: B. Acoustic Feature Extraction
confidence: 99%
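As background to the citation above: a cochleagram is a time-frequency representation obtained by passing the signal through an auditory (gammatone) filterbank and taking frame-wise log energies, and an MRCG stacks cochleagrams computed at several temporal resolutions. The following is a minimal illustrative sketch in NumPy only; the channel count, centre-frequency range, and the two frame lengths are illustrative assumptions, not the configuration of Chen et al., whose MRCG uses 64 channels, four resolutions and delta features.

```python
import numpy as np

def gammatone_ir(fc, fs, duration=0.05, order=4):
    """Impulse response of a gammatone filter centred at fc (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000 + 1)   # equivalent rectangular bandwidth
    b = 1.019 * erb                       # filter bandwidth parameter
    return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

def cochleagram(x, fs, n_channels=8, frame_len=320, hop=160):
    """Log-energy cochleagram: gammatone filterbank -> frame-wise log energy."""
    fcs = np.geomspace(80, fs / 2 * 0.9, n_channels)  # centre frequencies
    n_frames = 1 + (len(x) - frame_len) // hop
    cg = np.empty((n_channels, n_frames))
    for c, fc in enumerate(fcs):
        y = np.convolve(x, gammatone_ir(fc, fs))[: len(x)]
        for i in range(n_frames):
            frame = y[i * hop : i * hop + frame_len]
            cg[c, i] = np.log(np.sum(frame ** 2) + 1e-10)
    return cg

def mrcg(x, fs):
    """Stack cochleagrams at two resolutions (a simplification of the
    four-resolution MRCG of Chen et al.)."""
    cg_fine = cochleagram(x, fs, frame_len=320, hop=160)    # ~20 ms frames
    cg_coarse = cochleagram(x, fs, frame_len=3200, hop=160) # ~200 ms frames
    n = min(cg_fine.shape[1], cg_coarse.shape[1])           # align frame counts
    return np.vstack([cg_fine[:, :n], cg_coarse[:, :n]])
```

The per-frame functionals (means, deviations, etc.) referenced in the citing papers would then be computed over the rows of the stacked matrix returned by `mrcg`.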
“…• evaluating the potential of several feature sets designed for different computational paralinguistics tasks (eGeMAPS [35], emobase [36] and ComParE [37]), along with a recently proposed MRCG-derived feature set [38], for AD detection. This is, to the best of our knowledge, the first empirical attempt to use these feature sets as "digital biomarkers" for Alzheimer's disease.…”
confidence: 99%
“…These types of events might be denoted by specific combinations of body movements and intonation patterns by a speaker within an identifiable time interval. Detection of events related to human behavior, such as human-action (Uijlings et al, 2015), human-activity (Das et al, 2019; Singh and Vishwakarma, 2019), emotion (Haider et al, 2016b; Cowen et al, 2019; Haider and Luz, 2019; Hassan et al, 2019) and engagement (Curtis et al, 2015; Huang et al, 2016), has received increasing attention in the video analysis literature. A video of a talk or presentation will typically contain a combination of different social signals.…”
Section: Introduction
confidence: 99%
“…In the SAAM project [6], we are employing Ambient Assisted Living (AAL) technologies to analyse activities and health status, and provide personalised multimodal coaching to elderly persons living on their own or in assisted care settings. Such activities and status include mobility, sleep, social activity, air quality, cardiovascular health, diet [15] and attitudes [10].…”
Section: Introduction
confidence: 99%
“…Audio-visual signals are used in a number of automatic prediction tasks, including cognitive state detection [3], presentation skills assessment [11,13] and emotion recognition [10,12,8,9,14], the latter also being the topic of the audio-video challenge of the Emotion Recognition in the Wild Challenge (EmotiW 2018) [5] that we address in this paper. Approaches to audio-visual signal analysis have employed very high-dimensional feature spaces consisting of large numbers of potentially relevant acoustic/visual features.…”
Section: Introduction
confidence: 99%