End-to-end Spoken Language Understanding (SLU) systems, which avoid an intermediate speech-to-text conversion, are particularly promising in low-resource scenarios. They can be more effective when there is not enough labeled data to train reliable speech recognition and language understanding systems, or when running SLU on the edge is preferred over cloud-based services. In this paper, we present an approach for bootstrapping end-to-end SLU in low-resource scenarios. We show that incorporating layers extracted from pre-trained acoustic models, instead of using the typical Mel filter bank features, leads to better-performing SLU models. Moreover, the layers extracted from a model pre-trained on one language perform well for (a) SLU tasks in a different language and (b) utterances from speakers with speech disorders.
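A minimal PyTorch sketch of the idea described above: an SLU classifier is fed activations from an intermediate layer of a pre-trained acoustic model rather than raw Mel filter bank features. The encoder below is a stand-in with random weights; in practice its parameters would come from a model pre-trained on acoustic/ASR data, and the choice of which layer to tap is a design decision.

```python
import torch
import torch.nn as nn

class AcousticEncoder(nn.Module):
    """Stand-in for a pre-trained acoustic model (e.g., an ASR encoder)."""
    def __init__(self, n_mels=40, hidden=256):
        super().__init__()
        self.layer1 = nn.LSTM(n_mels, hidden, batch_first=True)
        self.layer2 = nn.LSTM(hidden, hidden, batch_first=True)

    def forward(self, mels):
        h1, _ = self.layer1(mels)   # lower pre-trained layer
        h2, _ = self.layer2(h1)     # higher pre-trained layer
        return h1, h2

class SLUClassifier(nn.Module):
    """Intent classifier trained on top of frozen pre-trained activations."""
    def __init__(self, feat_dim=256, n_intents=10):
        super().__init__()
        self.head = nn.Linear(feat_dim, n_intents)

    def forward(self, feats):
        pooled = feats.mean(dim=1)   # average over time frames
        return self.head(pooled)

encoder = AcousticEncoder()          # pre-trained weights would be loaded here
for p in encoder.parameters():       # freeze: only the SLU head is trained
    p.requires_grad = False

mels = torch.randn(8, 100, 40)       # batch of 8 utterances, 100 frames each
h1, _ = encoder(mels)
logits = SLUClassifier()(h1)         # layer-1 activations replace Mel features
```

Freezing the encoder keeps the number of trainable parameters small, which is what makes the approach attractive when labeled SLU data is scarce.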
Sildenafil citrate (SIL) is used in the treatment of erectile dysfunction and other chronic disorders. For the pharmacokinetic investigation of SIL, we developed a simple and sensitive method for estimating SIL in rat plasma by reverse-phase high-performance liquid chromatography (RP-HPLC). Drug samples were extracted by liquid-liquid extraction with 300 μl of acetonitrile and 5 ml of diethyl ether. Chromatographic separation was achieved on a C18 column using methanol:water (85:15 v/v) as the mobile phase at a flow rate of 1 ml/min, with UV detection at 230 nm. The retention time of SIL was 4.0 min, giving a total separation time of less than 5 min. The method was validated for accuracy, precision, linearity and recovery, with acceptable linearity over the range 0.1-6 μg/ml. It was successfully applied to the analysis of rat plasma samples for use in pharmacokinetic, drug interaction, bioavailability and bioequivalence studies.
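The linearity validation corresponds to fitting a calibration curve of detector response against concentration by least squares. A minimal sketch of that computation follows; the peak-area values are hypothetical placeholders, not data from the study.

```python
import numpy as np

conc = np.array([0.1, 0.5, 1.0, 2.0, 4.0, 6.0])       # ug/ml standards in the validated range
area = np.array([1.2, 6.1, 12.3, 24.0, 48.5, 72.9])   # hypothetical peak areas

slope, intercept = np.polyfit(conc, area, 1)           # least-squares calibration line
r = np.corrcoef(conc, area)[0, 1]
print(f"area = {slope:.2f} * conc + {intercept:.2f}, r^2 = {r**2:.4f}")

# Back-calculate an unknown plasma sample from its measured peak area
unknown_area = 30.0
print(f"estimated conc: {(unknown_area - intercept) / slope:.2f} ug/ml")
```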
The diachronic nature of broadcast news data leads to the problem of Out-Of-Vocabulary (OOV) words in Large Vocabulary Continuous Speech Recognition (LVCSR) systems. Analysis of OOV words reveals that a majority of them are Proper Names (PNs). However, PNs are important for automatic indexing of audio-video content and for obtaining reliable automatic transcriptions. In this paper, we focus on the problem of OOV PNs in diachronic audio documents. To enable recovery of the PNs missed by the LVCSR system, relevant OOV PNs are retrieved by exploiting the semantic context of the LVCSR transcriptions. For retrieval of OOV PNs, we explore topic and semantic context derived from Latent Dirichlet Allocation (LDA) topic models, continuous word vector representations, and the Neural Bag-of-Words (NBOW) model, which is capable of learning task-specific word and context representations. We propose a Neural Bag-of-Weighted Words (NBOW2) model which learns to assign higher weights to words that are important for retrieval of an OOV PN. With experiments on French broadcast news videos, we show that the NBOW and NBOW2 models outperform methods based on raw embeddings from LDA and Skip-gram models. Combining the NBOW and NBOW2 models gives faster convergence during training. Second-pass speech recognition experiments, in which the LVCSR vocabulary and language model are updated with the retrieved OOV PNs, demonstrate the effectiveness of the proposed context models.

Index Terms: large vocabulary continuous speech recognition, out-of-vocabulary, proper names, semantic context

I. INTRODUCTION

Broadcast news data are diachronic in nature, characterised by continuous changes in information and content. The frequent variations in linguistic content and vocabulary pose a challenge to Large Vocabulary Continuous Speech Recognition (LVCSR). All possible known words cannot be included in the vocabulary and Language Model (LM) of an LVCSR system because (a) there are many infrequent and new words, particularly Proper Names (PNs), which are not well represented in training data, and (b) doing so would increase the LVCSR search space and complexity without guaranteeing a decrease in the Word Error Rate (WER). Therefore a practical choice is to leave out part of the vocabulary, which then leads to Out-Of-Vocabulary (OOV) words.
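To make the retrieval step described in the abstract concrete, here is a minimal sketch: candidate OOV proper names are ranked by the cosine similarity between their embeddings and an averaged embedding of the LVCSR transcription (the simplest bag-of-words context). The vectors below are random placeholders; in the paper they would be learned representations (LDA, Skip-gram, NBOW, or NBOW2).

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["minister", "paris", "election", "storm"]
emb = {w: rng.normal(size=50) for w in vocab}            # in-vocabulary word vectors
pn_emb = {pn: rng.normal(size=50)
          for pn in ["Sarkozy", "Fukushima", "Strauss-Kahn"]}  # OOV PN candidates

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

transcript = ["minister", "election", "paris"]            # first-pass LVCSR output
context = np.mean([emb[w] for w in transcript], axis=0)   # bag-of-words context vector

ranked = sorted(pn_emb, key=lambda pn: cosine(context, pn_emb[pn]), reverse=True)
print(ranked)  # top-ranked PNs are added to the vocabulary/LM for a second pass
```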
The Neural Bag-of-Words (NBOW) model performs classification with an average of the input word vectors and achieves impressive performance. While the NBOW model learns word vectors targeted to the classification task, it does not explicitly model which words are important for a given task. In this paper, we propose an improved NBOW model with the ability to learn task-specific word importance weights. The word importance weights are learned by introducing a new weighted-sum composition of the word vectors. With experiments on standard topic and sentiment classification tasks, we show that (a) our proposed model learns meaningful word importance weights for a given task, and (b) our model gives the best accuracies among BOW approaches. We also show that the learned word importance weights are comparable to tf-idf-based word weights when used as features in a BOW SVM classifier.
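A PyTorch sketch of the weighted composition idea: NBOW averages word vectors, while the improved model replaces the average with a weighted sum whose per-word weights are learned jointly with the classifier. The scalar-gate parameterization below is one plausible choice for illustration, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn as nn

class WeightedNBOW(nn.Module):
    def __init__(self, vocab_size, dim, n_classes):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.gate = nn.Linear(dim, 1)          # produces a word importance score
        self.out = nn.Linear(dim, n_classes)

    def forward(self, word_ids):
        v = self.emb(word_ids)                       # (batch, words, dim)
        alpha = torch.sigmoid(self.gate(v))          # (batch, words, 1) importance weights
        z = (alpha * v).sum(1) / alpha.sum(1)        # weighted-average composition
        return self.out(z)

model = WeightedNBOW(vocab_size=10000, dim=100, n_classes=2)
logits = model(torch.randint(0, 10000, (4, 20)))     # 4 documents, 20 words each
```

Because each weight is a function of the word's own vector, the model can learn to down-weight uninformative words, which is why the learned weights end up behaving like tf-idf scores.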