2015
DOI: 10.1016/j.patrec.2015.05.017

Assessing speaker independence on a speech-based depression level estimation system

Cited by 22 publications (10 citation statements)
References 25 publications
“…Paula et al in [25] perform experiments using MFCCs, Shifted-Delta-Cepstra (SDC), Rasta Perceptual Linear Prediction Coefficients (PLP), and spectral and prosodic features as input to an i-vector system with feature concatenation for estimating depression level. Further, experiments in a speaker independent setup are shown in [26]. A multimodal setup using video features as well with MFCC based i-vectors is described in [27].…”
Section: Related Work On Voice Quality and I-vectors (mentioning)
confidence: 99%
“…The winning entry for the subchallenge [11] exploited changes in correlations across formant frequencies and channels of the delta-mel-cepstrum using a Gaussian mixture model (GMM). Subsequent approaches using AVEC-2013 include Cummins et al [12], who employed acoustic volume analysis combined with GMMs, Scherer et al [13], who explore reduced vowel space as an indicator of distress, and the iVector-based approach of Lopez-Ottero et al [14] applied to four depression severity classes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Labels of the testing partition were withheld for challenge purposes, so in the present study only the 200 recordings from the development and train sets were used. The dataset was annotated with a single label per recording, corresponding to speakers' scores on BDI-II, which according to its standardized cutoffs can be interpreted as minimal depression for a score of [0-13], mild [14-19], moderate [20-28]…”
Section: Datasets (mentioning)
confidence: 99%
“…The depression level estimation system used for our research represents speech information using the i-vector paradigm [4,16,19]. The use of i-vectors for depression level estimation aims at tackling the variability arising from gender, age, channel, speaker or message in the different recordings.…”
Section: Automatic Depression Detection (mentioning)
confidence: 99%
“…The assessed speaker de-identification approaches are suitable for these experiments since they can be straightforwardly applied to any speaker without having to train transformation functions between the input and target speakers [10,11]. Regarding the estimation of depression severity, an approach based on acoustic characteristics and i-vector representation combined with support vector regression is chosen [19] due to the characteristics of AVEC 2014 data and its acceptable results in this experimental framework.…”
Section: Introduction (mentioning)
confidence: 99%