2013
DOI: 10.2478/aoa-2013-0054

Speech Emotion Recognition under White Noise

Abstract: Speakers' emotional states are recognized from speech signals with additive white Gaussian noise (AWGN). The influence of white noise on a typical emotion recognition system is studied. The emotion classifier is implemented with a Gaussian mixture model (GMM). A Chinese speech emotion database is used for training and testing, which includes nine emotion classes (happiness, sadness, anger, surprise, fear, anxiety, hesitation, confidence and neutral state). Two speech enhancement algorithms are introduced fo…
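
As a rough illustration of the AWGN condition mentioned in the abstract, the sketch below adds white Gaussian noise to a signal at a chosen signal-to-noise ratio. It is not the authors' code: the function names `add_awgn` and `measured_snr_db`, the 10 dB target, and the synthetic 200 Hz tone standing in for speech are all assumptions made for the example.

```python
import math
import random


def add_awgn(signal, snr_db, seed=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db` dB."""
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10*log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    # Zero-mean Gaussian samples with variance equal to the target noise power.
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of `noisy` relative to `clean`."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for speech: one second of a 200 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(16000)]
noisy = add_awgn(clean, snr_db=10.0, seed=0)
```

Calling `measured_snr_db(clean, noisy)` should return a value close to the requested 10 dB. The small deviation comes from estimating the noise power from a finite number of samples.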

Cited by 38 publications (15 citation statements)
References 25 publications
“…The developed methods fall into two main categories. The first one includes speech enhancement or noise reduction techniques, either by means of speech sample reconstruction [7], noise compensation using histogram equalization [8], adaptive thresholding in the wavelet domain for noise cancellation [9], or spectral subtraction [10]. However, the downside of such methods is that they strongly rely on prior knowledge of noise, speech, or both, which limits their implementations.…”
Section: Introduction (mentioning)
confidence: 99%
“…These noises severely degrade the performance of systems and consequently affect the user experience in real-life conditions Tawari and Trivedi (2010); Schuller et al (2006); Huang et al (2013); Schuller et al (2011). Therefore, a fundamental step in this work is to investigate environmental noise reduction via a novel closed-form solution to the graph signal theory-based method, to reduce noise at features level and improve the performance of emotion prediction.…”
Section: Introduction (mentioning)
confidence: 99%
“…In Zhang et al (2018) supervised single-channel technique is applied to speech dereverberation and denoising. In Tawari and Trivedi (2010), the authors utilized a speech enhancement technique based on the adaptive thresholding in the wavelet domain to address noises while in Huang et al (2013), the authors studied the influence of additive white Gaussian noise on speakers emotion states via a Gaussian mixture model. However, these methods are performed on discrete emotion conditions.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most studies use data collected in laboratory setups with little or no background noise or reverberation. The effect of noise on emotion prediction has been explored in [41,42,43], but little work has been done to study the effect of noise and/or reverberation on depression data.…”
Section: Introduction (mentioning)
confidence: 99%