2020
DOI: 10.1109/access.2020.2998532

Recognition of Audio Depression Based on Convolutional Neural Network and Generative Antagonism Network Model

Abstract: This paper proposes an audio depression recognition method based on a convolutional neural network and a generative antagonism network model. First, the data set is preprocessed: long-term silent segments are removed and the remaining audio is spliced into a new audio file. Then, speech-signal features such as Mel-scale Frequency Cepstral Coefficients (MFCCs), short-term energy, and spectral entropy are extracted using an audio difference normalization algorithm. The extracted matrix vector feature data, whi…
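
The preprocessing and feature steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 400-sample frame length, the relative energy threshold, and the 0.5 s minimum silence duration are all assumptions chosen for illustration. The audio difference normalization algorithm is not described in this excerpt and is not reproduced, and MFCC extraction is omitted because it would normally come from an audio library.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def spectral_entropy(frames, eps=1e-12):
    """Shannon entropy (in bits) of each frame's normalized power spectrum."""
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    prob = power / (power.sum(axis=1, keepdims=True) + eps)
    return -np.sum(prob * np.log2(prob + eps), axis=1)

def remove_long_silence(x, sr, frame_len=400, rel_threshold=0.01,
                        min_silence_s=0.5):
    """Drop runs of low-energy frames lasting at least min_silence_s
    and splice the remaining frames back together.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = short_term_energy(frames)
    silent = energy < rel_threshold * energy.max()
    min_run = int(np.ceil(min_silence_s * sr / frame_len))

    keep = np.ones(n_frames, dtype=bool)
    run_start = None
    # Iterate one step past the end so a trailing silent run is closed.
    for i in range(n_frames + 1):
        is_silent = i < n_frames and silent[i]
        if is_silent and run_start is None:
            run_start = i
        elif not is_silent and run_start is not None:
            if i - run_start >= min_run:
                keep[run_start:i] = False
            run_start = None
    return frames[keep].reshape(-1)

# Usage: a 440 Hz tone, one second of silence, then the tone again.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
audio = np.concatenate([tone, np.zeros(sr), tone])

spliced = remove_long_silence(audio, sr)
frames = frame_signal(spliced, frame_len=400, hop=160)
features = np.column_stack([short_term_energy(frames),
                            spectral_entropy(frames)])
```

Each row of `features` holds the short-term energy and spectral entropy of one frame; in a fuller pipeline, MFCC columns would be stacked alongside them before being passed to a classifier.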

Cited by 29 publications (8 citation statements)
References 30 publications
“…We found that the KRR model with 40-dimensional principal components had the best performance, which was inputted with both facial and gaze features. Compared with some previous studies on auto recognition of depression, the proposed model has a smaller Mae value [39, 40] and a larger p-value [41]. This shows that the model performed well.…”
Section: Discussion
confidence: 83%
“…The values of S_k and W_k are different scales and sliding windows, and the calculation formula is [23], CNN [24], and DCNN [25]. The parameter settings of the network used in this research are shown in Table 1.…”
Section: Madn Method
confidence: 99%
“…With a purpose of achieving higher accuracy in detection and classification process, some of the significant machine-learning based approaches are viz. identifying depression from audio using convolution neural network in Wang et al [27], Jazaery and Guo [28] applying deep learning on spatio-temporal attributes, Ding et al [29], and Tadesse et al [30] support vector machine over textual data, Cao et al [31] deep neural network for investigate the possibility of bipolar disorder on mobile usage, integrated usage of K-nearest neighborhood, support vector machine, Mahendran et al [32] random forest and multi-layer perceptron for constructing generalized model for stacking. The above-mentioned studies offer a productive guideline for modeling bipolar disorder while the pitfalls associated with it is highlighted in next section.…”
Section: Related Work
confidence: 99%