Automatic Depression Analysis Using Dynamic Facial Appearance Descriptor and Dirichlet Process Fisher Encoding

He, Lang; Jiang, Dongmei; Sahli, Hichem

doi:10.1109/tmm.2018.2877129

Cited by 66 publications

(53 citation statements)

References 59 publications

“…To avoid distortion, other studies employed fixed-size histogram or other statistics to summarize the distribution of representations. Specifically, they generate video-level descriptors by computing statistics of features [31], [32], [33], [34], using Gaussian Mixture Model (GMM) [35], [36], [37], [38], [39], [40] or fisher vector [38], [41], etc. Although these methods summarize undistorted information, temporal relations between segments/frames, such as the order of events, are lost after creating the statistics.…”

Section: 21); citation type: mentioning

confidence: 99%
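
The point made in the citation statement above, that statistics-based video-level descriptors discard temporal order, can be illustrated with a minimal sketch. It assumes a hypothetical `stat_pool` helper that summarizes per-frame features by their per-dimension mean and standard deviation; the data are random toy values, not features from any of the cited papers.

```python
import random
import statistics


def stat_pool(frames):
    """Summarize per-frame feature vectors by per-dimension mean and std.

    `frames` is a list of equal-length feature vectors (one per frame).
    The result has a fixed size (2 * feature dimension) regardless of
    how many frames the video contains.
    """
    dims = list(zip(*frames))  # regroup values by feature dimension
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means + stds


# Toy "video": 50 frames with 4 features each (hypothetical data).
rng = random.Random(0)
frames = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(50)]

# The same frames in a different temporal order.
shuffled = frames[:]
rng.shuffle(shuffled)

original = stat_pool(frames)
permuted = stat_pool(shuffled)

# Largest difference between the two descriptors; it is (numerically) zero,
# so the descriptor cannot distinguish the two orderings.
max_diff = max(abs(a - b) for a, b in zip(original, permuted))
print(max_diff)
```

Because both orderings yield the same descriptor, any downstream model sees identical input for videos whose events occur in a different order.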

“…Early works [29], [31], [32], [51] generally use traditional Machine Learning models, e.g. Support Vector Machine Regression (SVR) [25], [33], decision tree [21], [43], [52], Logistic regression [53], etc., to predict depression from hand-crafted features (Local Binary Pattern (LBP) [38], [41], Low-Level Descriptor (LLD) [21], [34], [43], Histogram of oriented gradients (HOG) [26], etc). For example, Meng et al [29] extracted LBP and EOH as visual features and LLD as audio features, and applied Motion History Histogram (MHH) to extract dynamics from short video segments.…”

Section: Automatic Depression Analysis; citation type: mentioning

confidence: 99%

Spectral Representation of Behaviour Primitives for Depression Analysis

Song, Jaiswal, Shen, et al. 2022

IEEE Trans. Affective Comput.

Depression is a serious mental disorder affecting millions of people all over the world. Traditional clinical diagnosis methods are subjective, complicated and require extensive participation of clinicians. Recent advances in automatic depression analysis systems promise a future where these shortcomings are addressed by objective, repeatable, and readily available diagnostic tools to aid health professionals in their work. Yet there remain a number of barriers to the development of such tools. One barrier is that existing automatic depression analysis algorithms base their predictions on very brief sequential segments, sometimes as little as one frame. Another barrier is that existing methods do not take into account what the context of the measured behaviour is. In this paper, we extract multi-scale video-level features for video-based automatic depression analysis. We propose to use automatically detected human behaviour primitives as the low-dimensional descriptor for each frame. We also propose two novel spectral representations, i.e. spectral heatmaps and spectral vectors, to represent video-level multi-scale temporal dynamics of expressive behaviour. Constructed spectral representations are fed to Convolution Neural Networks (CNNs) and Artificial Neural Networks (ANNs) for depression analysis. We conducted experiments on the AVEC 2013 and AVEC 2014 benchmark datasets to investigate the influence of interview tasks on depression analysis. In addition to achieving state of the art accuracy in severity of depression estimation, we show that the task conducted by the user matters, that fusion of a combination of tasks reaches highest accuracy, and that longer tasks are more informative than shorter tasks, up to a point.
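
To make the idea of a fixed-size spectral descriptor concrete, here is a rough sketch, not the authors' method. It assumes each frame is described by a short vector of behaviour primitives, takes a plain discrete Fourier transform of each primitive's time series, and keeps the amplitudes of the lowest `k_bins` coefficients. The function names, `k_bins`, and the toy signals are all hypothetical. Note that in this simplification bin `k` means "k cycles over the whole video", so bins from videos of different lengths do not correspond to the same physical frequency.

```python
import cmath
import math


def dft_amplitudes(signal, k_bins):
    """Normalized amplitudes of the first k_bins DFT coefficients."""
    n = len(signal)
    amps = []
    for k in range(k_bins):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        amps.append(abs(coeff) / n)
    return amps


def spectral_vector(primitives, k_bins=8):
    """Concatenate per-primitive spectra into one fixed-size vector.

    `primitives` is a list of per-frame vectors; the output length is
    (number of primitives) * k_bins, independent of the frame count.
    """
    vec = []
    for channel in zip(*primitives):  # one time series per primitive
        vec.extend(dft_amplitudes(list(channel), k_bins))
    return vec


def toy_video(n_frames):
    """Two toy primitives: a sine with 3 cycles per video and a constant."""
    return [
        [math.sin(2 * math.pi * 3 * t / n_frames), 0.5]
        for t in range(n_frames)
    ]


short_vec = spectral_vector(toy_video(60))
long_vec = spectral_vector(toy_video(200))
print(len(short_vec), len(long_vec))
```

Both toy videos map to vectors of the same length even though one has 60 frames and the other 200, which is the property that lets a fixed-input network consume whole videos.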

“…Though several biomarkers for apathy are discussed in Hampel et al [6], automated apathy diagnosis is a novel research area of high impact and hence interest. The computer vision based analysis of face and gesture has shown to provide abundant information about different neurodegenerative disorders [10], [11], [12], [13], which we here aim at exploiting for apathy diagnosis.…”

Section: Related Work; citation type: mentioning

confidence: 99%

“…Reduced facial expressions or hypomimia was found to be a major cue for estimating the stage severity of Parkinson's disease [24]. The facial expression features (facial appearance and dynamics) were used to estimate the clinical depression scores [12]. Head and face movements: According to Hammal and Cohn [25], head motion also plays an important role in emotion communication.…”

Section: Related Work; citation type: mentioning

confidence: 99%

Characterizing the State of Apathy with Facial Expression and Motion Analysis

Happy, Dantcheva, Das, et al. 2019

2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019)

Reduced emotional response, lack of motivation, and limited social interaction comprise the major symptoms of apathy. Current methods for apathy diagnosis require the patient's presence in a clinic, and time consuming clinical interviews and questionnaires involving medical personnel, which are costly and logistically inconvenient for patients and clinical staff, hindering among other large scale diagnostics. In this paper we introduce a novel machine learning framework to classify apathetic and non-apathetic patients based on analysis of facial dynamics, entailing both emotion and facial movement. Our approach caters to the challenging setting of current apathy assessment interviews, which include short video clips with wide face pose variations, very low-intensity expressions, and insignificant inter-class variations. We test our algorithm on a dataset consisting of 90 video sequences acquired from 45 subjects and obtained an accuracy of 84% in apathy classification. Based on extensive experiments, we show that the fusion of emotion and facial local motion produces the best feature set for apathy classification. In addition, we train regression models to predict the clinical scores related to the mental state examination (MMSE) and the neuropsychiatric apathy inventory (NPI) using the motion and emotion features. Our results suggest that the performance can be further improved by appending the predicted clinical scores to the video-based feature representation.

“…Jain et al [16] adopted Fisher Vector to encode the original waveform to detect the depression level of individuals. However, the number of Gaussian components was not adapted to the depression detection task, which affected the accuracy of prediction [22]. He et al [12] proposed a four-stream CNN to detect an individual depression level.…”

Section: Related Work; citation type: mentioning

confidence: 99%

Automatic Depression Level Detection via ℓp-Norm Pooling

et al. 2019

Related physiological studies have shown that Mel-frequency cepstral coefficient (MFCC) is a discriminative acoustic feature for depression detection. This fact has led to some works using MFCCs to identify individual depression degree. However, they rarely adopt neural network to capture high-level feature associated with depression detection. And the suitable feature pooling parameter for depression detection has not been optimized. For these reasons, we propose a hybrid network and p-norm pooling combined with least absolute shrinkage and selection operator (LASSO) to improve the accuracy of depression detection. Firstly, the MFCCs of the original speech are divided into many segments. Then, we extract the segment-level feature using the proposed hybrid network, which investigates the depression-related information in the spatial structure, temporal changes and discriminative representation of short-term MFCC segments. Thirdly, p-norm pooling combined with LASSO is adopted to find the optimal pooling parameter for depression detection to generate the utterance-level feature. Finally, depression level prediction is accomplished using support vector regression (SVR). Experiments are conducted on AVEC2013 and AVEC2014. The results demonstrate that our proposed method achieves better performance than the previous algorithms.
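
As an illustration of how a pooling parameter p changes an utterance-level feature, the sketch below uses the generalized-mean form of p-norm pooling, pooled_d = ((1/N) Σ_n |x_{n,d}|^p)^(1/p). This form is an assumption; the abstract does not give the exact formula, and the LASSO-based selection of p is not shown. The function name and toy segment features are hypothetical.

```python
def p_norm_pool(segments, p):
    """Pool segment-level feature vectors into one utterance-level vector.

    Assumed form: per-dimension generalized mean of absolute values.
    p = 1 gives average pooling of magnitudes; large p approaches max pooling.
    """
    n = len(segments)
    pooled = []
    for dim in zip(*segments):  # iterate over feature dimensions
        pooled.append((sum(abs(x) ** p for x in dim) / n) ** (1.0 / p))
    return pooled


# Three toy segments with two features each (hypothetical values).
segments = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]]

mean_like = p_norm_pool(segments, 1)   # behaves like average pooling
max_like = p_norm_pool(segments, 64)   # approaches max pooling

print(mean_like, max_like)
```

Intermediate values of p interpolate between these two extremes, which is why the choice of p can be treated as a tunable parameter.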
