This paper presents new techniques that improve on the primary system our group submitted to the Albayzin 2012 LRE competition, where the use of any additional corpora for training or optimizing the models was forbidden. We incorporate an additional phonotactic subsystem based on phone log-likelihood ratio (PLLR) features extracted from different phonotactic recognizers, which improves the accuracy of the system by 21.4% in terms of C_avg (we also report results for F_act, the official metric of the evaluation). We show that using these features at the phone state level provides significant improvements when combined with dimensionality reduction techniques, especially PCA. Applying alternative SDC-like configurations to these PLLR features yields additional improvements. We also describe some modifications to the MFCC-based acoustic i-vector system that contributed further gains. The final fused system outperformed the baseline by 27.4% in C_avg.
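As a rough illustration of what a PLLR feature is, the sketch below converts one frame of phone posteriors into log-likelihood ratios. The abstract does not give the formula, so the definition used here, log(p_i / ((1 - p_i) / (N - 1))), is an assumption based on the commonly cited form; the function name `pllr_frame` and the clipping constant `eps` are hypothetical. Dimensionality reduction (PCA) and SDC-like stacking are not shown.

```python
import math
from typing import List, Sequence


def pllr_frame(posteriors: Sequence[float], eps: float = 1e-10) -> List[float]:
    """Turn one frame of phone posteriors into PLLR-style features.

    Assumed definition (not stated in the abstract):
        llr_i = log(p_i) - log((1 - p_i) / (N - 1))
    where N is the number of phone classes (or phone states).
    """
    n = len(posteriors)
    if n < 2:
        raise ValueError("need at least two phone classes")
    feats = []
    for p in posteriors:
        # Clip to avoid log(0) for posteriors that are exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        feats.append(math.log(p) - math.log((1.0 - p) / (n - 1)))
    return feats


# One frame from a hypothetical 4-phone recognizer.
frame = [0.7, 0.1, 0.1, 0.1]
features = pllr_frame(frame)
```

Under this assumed definition, a uniform posterior maps to a vector of zeros, and a dominant phone gets a positive value while the others become negative.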
This work focuses on speech interfaces for controlling household electronic devices. In particular, we present an example of a spoken dialogue system for controlling a Hi-Fi audio system. This system demonstrates that a more natural, flexible and robust dialogue is possible. This is due both to the Bayesian network-based solution that we propose for dialogue modeling and to carefully designed strategies for handling contextual information.
We present an approach to dynamically adapt the language models (LMs) used by a speech recognizer that is part of a spoken dialogue system. We have developed a grammar generation strategy that automatically adapts the LMs using the semantic information that the user provides (represented as dialogue concepts), together with information about the intentions of the speaker (inferred by the dialogue manager and represented as dialogue goals). We carry out the adaptation as a linear interpolation between a background LM and one or more of the LMs associated with the dialogue elements (concepts or goals) addressed by the user. The interpolation weights between those models are automatically estimated on each dialogue turn, using measures such as the posterior probabilities of concepts and goals, which are estimated as part of the inference procedure that determines the actions to be carried out. We propose two approaches to handle the LMs related to concepts and goals. In the first, we estimate an LM for each of them; in the second, we apply several clustering strategies to group together those elements that share some common properties, and estimate an LM for each cluster. Our evaluation shows that the system can estimate a dynamic model adapted to each dialogue turn. This significantly improves speech recognition performance, which in turn improves both the language understanding and the dialogue management tasks.
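The linear interpolation described above can be sketched with toy unigram models. This is a minimal illustration, not the authors' implementation: the LMs are plain word-to-probability dictionaries, and the fixed `background_weight` and the rule of splitting the remaining mass in proportion to the posteriors are assumptions made for the example.

```python
from typing import Dict, List

LM = Dict[str, float]  # toy unigram model: word -> probability


def interpolate_lms(
    background: LM,
    element_lms: List[LM],
    posteriors: List[float],
    background_weight: float = 0.5,  # hypothetical fixed share for the background LM
) -> LM:
    """Linearly interpolate a background LM with concept/goal LMs.

    The mass not given to the background LM is split among the element
    LMs in proportion to their posterior probabilities for this turn.
    """
    if len(element_lms) != len(posteriors):
        raise ValueError("one posterior is needed per element LM")
    total = sum(posteriors)
    if not element_lms or total <= 0:
        return dict(background)  # nothing to adapt with: keep the background LM
    weights = [(1.0 - background_weight) * p / total for p in posteriors]

    vocab = set(background)
    for lm in element_lms:
        vocab.update(lm)

    mixed: LM = {}
    for word in vocab:
        prob = background_weight * background.get(word, 0.0)
        for weight, lm in zip(weights, element_lms):
            prob += weight * lm.get(word, 0.0)
        mixed[word] = prob
    return mixed


# One dialogue turn: the "volume" concept is far more likely than "track".
background_lm = {"play": 0.4, "stop": 0.3, "louder": 0.2, "next": 0.1}
volume_lm = {"louder": 0.7, "quieter": 0.3}
track_lm = {"next": 0.6, "previous": 0.4}
turn_lm = interpolate_lms(background_lm, [volume_lm, track_lm], [0.9, 0.1])
```

Because every input distribution sums to one and the weights sum to one, the interpolated model is again a valid distribution. Re-running the function with new posteriors on each turn gives the per-turn adaptation.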
Entropy measures are an accessible tool for quantifying the irregularity and uncertainty present in time series. In signal processing in particular, Multiscale Permutation Entropy (MPE) is a characterization methodology capable of measuring the randomness and non-linear dynamics present in non-stationary signals, such as mechanical vibrations. In this article, we present a robust methodology based on MPE for the detection of Internal Combustion Engine (ICE) states. MPE is combined with Principal Component Analysis (PCA) as a technique for visualization and feature selection, and with K-Nearest Neighbors (KNN) as a supervised classifier. The proposed methodology is validated by comparing its accuracy and computation time with those of other methods presented in the literature. The results show high effectiveness in detecting bearing failures (experiment 1) and ICE states (experiment 2) at a low computational cost.
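To make the MPE feature concrete, the following sketch computes normalized permutation entropy over several coarse-grained versions of a signal. It assumes the standard formulation (ordinal patterns of a given order, non-overlapping averaging for each scale); the parameter defaults and function names are illustrative, and the PCA and KNN stages are omitted.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def permutation_entropy(x: Sequence[float], order: int = 3, delay: int = 1) -> float:
    """Normalized permutation entropy in [0, 1] of a 1-D series."""
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this order and delay")
    counts: Counter = Counter()
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices that sort the window (ties keep time order).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    entropy = 0.0
    for c in counts.values():
        p = c / n
        entropy -= p * math.log(p)
    return entropy / math.log(math.factorial(order))


def coarse_grain(x: Sequence[float], scale: int) -> List[float]:
    """Average consecutive non-overlapping blocks of length `scale`."""
    n = len(x) // scale
    return [sum(x[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def multiscale_permutation_entropy(
    x: Sequence[float], order: int = 3, delay: int = 1, max_scale: int = 5
) -> List[float]:
    """One permutation-entropy value per scale 1..max_scale."""
    return [
        permutation_entropy(coarse_grain(x, s), order, delay)
        for s in range(1, max_scale + 1)
    ]


# Synthetic stand-in for a vibration signal: white Gaussian noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
mpe_features = multiscale_permutation_entropy(noise)
```

Each signal yields a short feature vector (one value per scale), which is the kind of input that could then be projected with PCA and classified with KNN.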