Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge 2014
DOI: 10.1145/2661806.2661812

Automatic Depression Scale Prediction using Facial Expression Dynamics and Regression

Abstract: Depression is a state of low mood and aversion to activity that can affect a person's thoughts, behavior, feelings and sense of well-being. In such a low mood, both the facial expression and voice appear different from the ones in normal states. In this paper, an automatic system is proposed to predict the scales of Beck Depression Inventory from naturalistic facial expression of the patients with depression. Firstly, features are extracted from corresponding video and audio signals to represent characteristic…

Cited by 86 publications (49 citation statements)
References 30 publications
“…Table 3 compares the MAE and RMSE accuracy of proposed and related methods on the AVEC2014 dataset. The methods based on handcrafted features are in [22,23,24]. We can cite as an example, the baseline method provided by AVEC2014 competition which is based on Local Binary Pattern (LBP) and LPQ.…”
Section: Experimental Analysis (mentioning)
confidence: 99%
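The MAE and RMSE figures that the statement above compares can be illustrated with a minimal sketch. The score lists below are made-up values for illustration only, not data from AVEC2014 or from any cited paper, and the function names are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length score lists."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length score lists."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical depression-scale scores (assumed values, not from the dataset).
true_scores = [10, 20, 30, 5]
predicted_scores = [12, 17, 30, 10]

print("MAE:", mae(true_scores, predicted_scores))
print("RMSE:", rmse(true_scores, predicted_scores))
```

RMSE penalizes large individual errors more heavily than MAE, so for the same predictions RMSE is never smaller than MAE.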
“…Using AVEC and a few non-publicly available resources [25], audiovisual detection of depression has been proposed [26], [27], [28], [29], [30], [31], [32], [33]. In [28] for instance, visual bag-of-words (BoW) features computed from space time interest points (STIP), were combined with mel-frequency cepstral coefficients (MFCCs) features.…”
Section: Introduction (mentioning)
confidence: 99%
“…The extracted audiovisual features were encoded using a Fisher Vector representation and a linear SVR was used to learn BDI score classification. In [31], visual Motion History Histogram (MHH) features were measured from three different visual texture features (Local Binary Patterns, Edge Orientation Histogram, and Local Phase Quantization) and combined with low-level audio descriptors provided in [21]. Partial Least Square (PLS) and Linear regression algorithms were used to model the mapping between the extracted features and BDI scores for face and voice features separately, followed by a decision based combination.…”
Section: Introduction (mentioning)
confidence: 99%
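The statement above describes separate regressors for face and voice features followed by a decision-based combination. A rough sketch of that general idea follows, under loud assumptions: ordinary least squares on a single scalar feature stands in for the PLS and linear regression models, the fusion rule is a plain weighted average (the statement does not specify one), and all names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (1-D)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to one feature value."""
    slope, intercept = model
    return slope * x + intercept


def fuse(face_pred, voice_pred, face_weight=0.5):
    """Decision-level fusion: weighted average of per-modality predictions.

    The weighting scheme is an assumption made for illustration.
    """
    return face_weight * face_pred + (1.0 - face_weight) * voice_pred


# Hypothetical one-dimensional features and target scores (assumed values).
face_feats = [1.0, 2.0, 3.0, 4.0]
voice_feats = [2.0, 4.0, 6.0, 8.0]
scores = [5.0, 10.0, 15.0, 20.0]

# One regressor per modality, trained independently.
face_model = fit_line(face_feats, scores)
voice_model = fit_line(voice_feats, scores)

# Predict for a new sample in each modality, then combine the two decisions.
fused = fuse(predict(face_model, 5.0), predict(voice_model, 10.0))
print("Fused prediction:", fused)
```

Keeping the modalities separate until the final step means either regressor can be swapped out or reweighted without retraining the other.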
“…We will also extend the proposed method on the BlackDog Institute clinical depression data [17].
[32]                   8.12   6.31  Bimodal (Au, Vi)
Kachele et al [19]     9.70   7.28  Multimodal (Au, Vi, Meta)
Jan et al [15]        10.26   8.30  Bimodal (Au, Vi)
Perez et al [27]      10.82   8.99  Bimodal (Au, Vi)
Perez et al [27]      11.91   9.35  Unimodal (Au)
Jain et al [14]       10.24   8.39  Unimodal (Vi)
Gupta et al [13]      10.33   -     Multimodal (Au, Vi, Text)
Kaya et al [20]        9.61   7.69  Bimodal (Au, Vi)
Kaya et al [20]        9.97   7.96  Unimodal (Vi)
Baseline [30]          9.98   7.89  Bimodal (Au, Vi)
Video Baseline [30]   10.85   8.85  Unimodal (Vi) …”
Section: Discussion (mentioning)
confidence: 99%