Interspeech 2018
DOI: 10.21437/interspeech.2018-1399

Effectiveness of Voice Quality Features in Detecting Depression

Abstract: Automatic assessment of depression from speech signals is affected by variabilities in acoustic content and speakers. In this study, we focused on addressing these variabilities. We used a database comprised of recordings of interviews from a large number of female speakers: 735 individuals suffering from depressive (dysthymia and major depression) and anxiety disorders (generalized anxiety disorder, panic disorder with or without agoraphobia) and 953 healthy individuals. Leveraging this unique and extensive d…

Cited by 53 publications
(37 citation statements)
References 38 publications

“…This limited generalizability and overfitting are observed for instance in the drop in performance from development to test set in submissions to the AVEC challenges. For results that used held‐out test sets, which are more likely to generalize if they are representative, scores range from close to chance to higher scores including Afshan et al (F1‐score = 0.95) which most likely benefited from having a large sample size (N depressed = 735, N controls = 953) and all participants being the same sex (female). At the same time, Kächele et al obtained one of the highest performances in AVEC 2014 (ie, mean absolute error = 7.08), simply using provided audio baseline features and a random forest classifier (the highest performance combined audio and visual features).…”
Section: Discussion (mentioning)
confidence: 99%
“…The mean age of the control group was 30.1 years (± 12.6 years), whereas the mean age of the depression group was 42.9 years (± 13.0 years). There is no standardization on age controlling in studies: some selected age-matched controls to their samples (Alghowinem et al 2013b; Alghowinem et al 2012; Cummins et al 2015); and some did not (Afshan et al 2018; Cannizzaro et al 2004; Higuchi et al 2018; Jiang et al 2017; Joshi et al 2013; Liu et al 2015; Ozdas et al, 2004; Scherer et al 2013). Given this heterogeneity, in this work, we assume the perspective of the majority of revised studies in which age between groups was not controlled.…”
Section: Methods (mentioning)
confidence: 99%
“…Taken together, our research has replicated previous results in which voice features were found to classify depression and has shown a stable generalizability when applied to new datasets, even under different emotion context. Though the length of voice recordings in our research are around 10 s, research on the same interview speech dataset has showed even 10 seconds length can reach ideal classification accuracy [60]. What’s more, short utterance has been proved to be effective in speaker identification [61-64].…”
Section: Discussion (mentioning)
confidence: 99%