We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in word error rate (WER) over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features.
Hypoxemia is the most common adverse event during gastrointestinal (GI) endoscopy. American Society of Anesthesiologists (ASA) classification scores have been used as a major predictor of hypoxemia risk before endoscopy. However, we questioned the accuracy of ASA scores for predicting hypoxemia, because the classification system ignores much information about a patient's general health status and fitness that may contribute to hypoxemia. In this retrospective review of prospectively collected clinical data, 4904 procedures were analyzed. Categorical factors were compared using Pearson’s chi-square test or the Fisher exact test. Continuous variables were evaluated using t-tests or analysis of variance (ANOVA). Hypoxemia occurred during endoscopy in only 245 (5.0%) of the 4904 enrolled patients. Multivariable logistic regression identified the following independent risk factors for hypoxemia: high BMI (BMI 30 versus 20, odds ratio: 1.52, 95% CI: 1.13–2.05; P = 0.0098), hypertension (odds ratio: 2.28, 95% CI: 1.44–3.60; P = 0.0004), diabetes (odds ratio: 2.37, 95% CI: 1.30–4.34; P = 0.005), gastrointestinal diseases (odds ratio: 1.77, 95% CI: 1.21–2.60; P = 0.0033), heart diseases (odds ratio: 1.97, 95% CI: 1.06–3.68; P = 0.0325) and procedures that combined esophagogastroduodenoscopy (EGD) and colonoscopy (odds ratio: 4.84, 95% CI: 1.61–15.51; P = 0.0292; EGD as reference). Notably, the ASA classification score was not an independent predictive factor, and young patients were as susceptible to hypoxemia during endoscopy as older patients. In conclusion, several pre-existing diseases were newly identified as independent risk factors for hypoxemia during GI endoscopy. High ASA scores are a confounding predictive factor that reflects pre-existing diseases. We thus recommend that young patients (≤18 yrs), obese patients and patients with hypertension, diabetes, heart diseases or GI diseases be monitored closely during sedation endoscopy.
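For readers unfamiliar with how an odds ratio and its 95% confidence interval are read, the following is a minimal sketch of the unadjusted calculation from a 2x2 table, using the standard Wald interval on the log scale. It is not the study's method: the abstract reports adjusted odds ratios from multivariable logistic regression, which this sketch does not reproduce. The function name `odds_ratio_ci` and all counts are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, event      b: exposed, no event
    c: unexposed, event    d: unexposed, no event
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts, NOT taken from the study:
# 20 of 100 exposed patients and 10 of 100 unexposed patients had the event.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as all the intervals reported above do, indicates an association that is significant at the 5% level. In this hypothetical table the interval includes 1, so the association would not be.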
Automatic speaker verification (ASV) has achieved significant progress in recent years. However, it is still very challenging to generalize ASV technologies to new, unknown and spoofing conditions. Most previous studies focused on extracting speaker information from natural speech. This paper approaches speaker verification from another perspective: exploiting the speaker identity information in singing speech. We first designed and released a new corpus for speaker verification based on singing and normal reading speech. We then compared and analyzed speaker discrimination between natural and singing speech in different feature spaces. Furthermore, the conventional Gaussian mixture model, dynamic time warping and a state-of-the-art deep neural network were investigated and used to build text-dependent ASV systems under different training-test conditions. Experimental results show that the voiceprint information in singing speech was more distinguishable than that in normal speech. A relative reduction in equal error rate of more than 20% was obtained on both the gender-dependent and gender-independent 1 s-1 s evaluation tasks.
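The equal error rate used as the metric above is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of one common way to estimate it from verification scores, by sweeping every observed score as a threshold. It is not the evaluation code used in the paper; the function name `equal_error_rate` and the scores are hypothetical.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by trying every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    FAR = fraction of impostor trials accepted.
    FRR = fraction of target trials rejected.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical similarity scores, NOT taken from the paper.
targets = [0.9, 0.8, 0.7, 0.4]    # same-speaker trials
impostors = [0.6, 0.3, 0.2, 0.1]  # different-speaker trials
print(equal_error_rate(targets, impostors))  # prints 0.25
```

With only a handful of trials the estimate is coarse. Evaluation toolkits typically interpolate between thresholds on a much larger trial list.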