We propose a robust and fast dereverberation technique for real-time speech recognition applications. First, we effectively identify the late reflection components of the room impulse response. We use this information together with the concept of Spectral Subtraction (SS) to remove the late reflection components of the reverberant signal. Because the clean speech is unavailable in an actual scenario, the late reflection is estimated by approximation, and the estimation error is corrected through multi-band SS. The multi-band coefficients are optimized during offline training and used in the actual online dereverberation. The proposed method performs better and faster than the relevant approach using Multi-LPC and a reverberant matched model. Moreover, the proposed method is robust to speaker and microphone locations.
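The multi-band subtraction step described above can be illustrated with a minimal sketch. It assumes the reverberant power spectrum and an estimate of the late reflection power are already available for one frame; the function name, the band layout, and the spectral floor are hypothetical choices made for illustration, not details taken from the paper.

```python
def multiband_spectral_subtraction(reverb_power, late_power, band_edges,
                                   scale_factors, floor=0.01):
    """Subtract a scaled late-reflection power estimate, band by band.

    reverb_power  : power spectrum of one reverberant frame (list of floats)
    late_power    : estimated late-reflection power for the same bins
    band_edges    : bin indices delimiting the bands, e.g. [0, 2, 4]
    scale_factors : one coefficient per band (assumed trained offline)
    floor         : fraction of the input kept when subtraction goes negative
                    (an assumption; the abstract does not specify flooring)
    """
    if len(reverb_power) != len(late_power):
        raise ValueError("spectra must have the same number of bins")
    if len(band_edges) != len(scale_factors) + 1:
        raise ValueError("need exactly one scale factor per band")

    cleaned = list(reverb_power)
    for band, delta in enumerate(scale_factors):
        for k in range(band_edges[band], band_edges[band + 1]):
            subtracted = reverb_power[k] - delta * late_power[k]
            # Flooring avoids negative power after over-subtraction.
            cleaned[k] = max(subtracted, floor * reverb_power[k])
    return cleaned
```

A larger scale factor in a band compensates for an under-estimated late reflection in that band, which is one way to read the error correction role the abstract assigns to the multi-band coefficients.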
Automatic speech recognition (ASR) in reverberant environments is a challenging task. Most dereverberation techniques address this problem through signal processing and enhance the reverberant waveform independently of the speech recognizer. In this paper, we propose a novel scheme to perform dereverberation in relation to the likelihood of the back-end ASR system. Our proposed approach selects the dereverberation parameters, in the form of multi-band scale factors, so that they improve the likelihood of the acoustic model. Then, the acoustic model is retrained using the optimal parameters. During the recognition phase, we perform additional optimization of the parameters. By using a Gaussian mixture model (GMM), the process of selecting the scale factors becomes efficient. Moreover, we remove the dependency of the adopted dereverberation technique on room impulse response (RIR) measurement by using an artificial RIR generator and selecting the RIR based on the acoustic likelihood. Experimental results show significant improvement in recognition performance with the proposed method over the conventional approach.
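The idea of choosing a scale factor by acoustic likelihood can be sketched as a small grid search. This is a simplified illustration under stated assumptions: the feature is a scalar log band power, the GMM is one-dimensional, and the candidate grid and function names are hypothetical. The actual features, model, and search procedure used in the paper are not given in the abstract.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture."""
    terms = []
    for w, m, v in zip(weights, means, variances):
        log_gauss = -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        terms.append(math.log(w) + log_gauss)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def select_scale_factor(reverb_band, late_band, candidates, gmm, floor=0.01):
    """Return the candidate scale factor with the highest total GMM score.

    reverb_band, late_band : positive band powers per frame for one band
    candidates             : scale factors to try (hypothetical grid)
    gmm                    : tuple (weights, means, variances) over log power
    """
    best_scale, best_score = None, -math.inf
    for delta in candidates:
        score = 0.0
        for x, late in zip(reverb_band, late_band):
            # Same floored subtraction as in multi-band SS; stays positive.
            cleaned = max(x - delta * late, floor * x)
            score += gmm_log_likelihood(math.log(cleaned), *gmm)
        if score > best_score:
            best_scale, best_score = delta, score
    return best_scale
```

Running this once per band yields a set of multi-band scale factors tied to the model's likelihood rather than to a waveform-level criterion, which is the distinction the abstract draws from conventional dereverberation.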
Several studies have been conducted to recognize emotions using various modalities such as facial expressions, gestures, speech, or physiological signals. Among these modalities, physiological signals are especially interesting because they are mainly controlled by the autonomic nervous system. It has been shown, for example, that there is an undeniable relationship between emotional state and Heart Rate Variability (HRV). In this paper, we present a methodology to monitor emotional state from physiological signals acquired remotely. The method is based on a remote photoplethysmography (rPPG) algorithm that estimates remote Heart Rate Variability (rHRV) using a simple camera. We first show that the rHRV signal can be estimated with high accuracy (more than 96% in the frequency domain). Then, a frequency-domain feature of the rHRV is calculated, and we show that there is a strong correlation between this rHRV feature and different emotional states. This observation has been validated on 12 out of 16 volunteers with video-induced emotions, which opens the way to contactless monitoring of emotions from physiological signals.