Sparse Non-negative Matrix Factorization (SNMF) and Deep Neural Networks (DNN) have each emerged as efficient machine learning techniques for single-channel speech enhancement. Nevertheless, only a few works have investigated combining SNMF and DNN for speech enhancement and robust Automatic Speech Recognition (ASR). In this paper, we present a novel combination of speech enhancement components based on SNMF and DNN into a full-stack system. We refine the cost function of the DNN to back-propagate the reconstruction error of the enhanced speech. Our proposal is compared with several state-of-the-art speech enhancement systems. Evaluations are conducted on the data of the CHiME-3 challenge, which consists of real noisy speech recordings captured under challenging noise conditions. Our system yields significant improvements over the best baselines: a relative gain of 30% on objective speech enhancement quality measures and a 10% relative Word Error Rate reduction for ASR.
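For readers unfamiliar with SNMF, the following is a minimal sketch of one common formulation, not the system described in the abstract above. It assumes a Euclidean reconstruction cost with an L1 penalty on the activations and multiplicative updates; the abstract does not specify the cost function or update rules actually used. The function name `sparse_nmf` and its parameters (`rank`, `sparsity`, `n_iter`) are hypothetical names chosen for illustration.

```python
import numpy as np

def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x frames) as W @ H with sparse H.

    Assumed objective (illustrative only):
        0.5 * ||V - W H||_F^2 + sparsity * sum(H)
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # dictionary (spectral bases)
    H = rng.random((rank, n_frames)) + eps  # activations
    W /= np.linalg.norm(W, axis=0, keepdims=True)

    def objective():
        return 0.5 * np.sum((V - W @ H) ** 2) + sparsity * H.sum()

    losses = [objective()]
    for _ in range(n_iter):
        # Multiplicative update for H; the L1 penalty appears in the denominator.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Multiplicative update for W (no penalty on the dictionary).
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Renormalize dictionary columns and rescale H so that W @ H is unchanged;
        # this prevents the penalty from being evaded by shrinking H and growing W.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
        W /= norms
        H *= norms.T
        losses.append(objective())
    return W, H, losses

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic non-negative "spectrogram" of rank 3, standing in for real data.
    V = rng.random((20, 3)) @ rng.random((3, 30))
    W, H, losses = sparse_nmf(V, rank=3)
    print(f"objective: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In NMF-based enhancement generally, `V` would be a magnitude spectrogram, and separate speech and noise dictionaries are often learned and concatenated before estimating activations on noisy input. Whether the system above follows that recipe is not stated in the abstract.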
The detection and characterization, in audiovisual documents, of speech utterances in which person names are pronounced is an important cue for spoken content analysis. This paper tackles the problem of retrieving spoken person names in the 1-Best ASR outputs of broadcast TV shows. Our assumption is that a person name is a latent variable produced by the lexical context in which it appears. A spoken name could therefore be derived from ASR outputs even if it has not been proposed by the speech recognition system. A new context model is proposed to capture the lexical and structural information surrounding a spoken name. The fundamental hypothesis of this study has been validated on broadcast TV documents available in the context of the REPERE challenge.