Abstract: A novel optical decision circuit based on a Mach-Zehnder or Michelson interferometer with a gain-clamped semiconductor optical amplifier in each arm is proposed. Simulation results show that the component, whose decision threshold can easily be modified by adjusting the currents in both amplifiers, exhibits excellent reshaping capabilities.
Lightweight speaker-dependent (SD) automatic speech recognition (ASR) is a promising solution to two problems: the risk of disclosing personal information and the difficulty of obtaining training material for many seldom-used English words and (often non-English) names. The dynamic time warping (DTW) algorithm is the state-of-the-art algorithm for small-footprint SD ASR applications, which have limited storage space and small vocabularies. In our previous work, we developed two fast and accurate DTW variations for clean speech data. However, speech recognition in adverse conditions remains a major challenge. To improve recognition accuracy in noisy and poor recording conditions, such as a recording volume that is too high or too low, we introduce a novel weighted DTW method. This method defines a feature index for each time frame of the training data and then applies it in the core DTW process to tune the final alignment score. In extensive experiments on one representative SD dataset of three speakers' recordings, our method achieves better accuracy than DTW, with a 0.5% relative reduction of error rate (RRER) on clean speech data and a 7.5% RRER on noisy and poorly recorded speech data. To the best of our knowledge, our new weighted DTW is the first weighted DTW method specially designed for speech data in noisy and poor recording conditions.
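To make the idea of a per-frame weight inside DTW concrete, the following is a minimal sketch and not the authors' method. It assumes that each training (template) frame carries a scalar weight that multiplies the local frame distance; the abstract does not say how the feature index is computed or applied, so the `weights` argument, the function names, and the Euclidean local distance are all illustrative assumptions. The helper `relative_error_reduction` uses the usual definition of a relative reduction, (baseline - new) / baseline.

```python
import math


def weighted_dtw(template, query, weights=None):
    """Accumulated DTW cost between two sequences of feature frames.

    template, query: sequences of equal-dimension tuples (one tuple per frame).
    weights: hypothetical per-frame weights for the template frames
             (assumption: a weight scales the local distance of its frame).
    """
    n, m = len(template), len(query)
    if weights is None:
        weights = [1.0] * n  # plain, unweighted DTW

    # cost[i][j] = best accumulated cost aligning template[:i] with query[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = weights[i - 1] * math.dist(template[i - 1], query[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # template frame advances alone
                cost[i][j - 1],      # query frame advances alone
                cost[i - 1][j - 1],  # both frames advance
            )
    return cost[n][m]


def relative_error_reduction(baseline_error, new_error):
    """Relative reduction of error rate: (baseline - new) / baseline."""
    return (baseline_error - new_error) / baseline_error
```

In this sketch, lowering the weight of a template frame reduces its influence on the final alignment score, which is one plausible way to de-emphasise frames judged unreliable. For example, with made-up error rates of 10% and 9.25%, `relative_error_reduction(0.10, 0.0925)` is approximately 0.075, that is, a 7.5% relative reduction; these numbers are purely illustrative and are not taken from the paper.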
Abstract: A rate-equation model of a gain-clamped semiconductor optical amplifier (GCSOA) is presented. Both a time-domain and a small-signal analysis of these rate equations are used to investigate the crosstalk between different signal channels. It is shown that the crosstalk of GCSOAs depends strongly on the bit rate of the amplified signals and is lower at both very high and low bit rates. This crosstalk is proportional to the input power and, approximately, to the amplification.
Grammar-based speech recognition systems exhibit performance degradation as their vocabulary sizes increase. Data clustering is expected to reduce the severity of this problem. We introduce an approach to data clustering for automatic speech recognition systems using a Kohonen Self-Organized Map. The clustering results are then used to build a language model for each cluster with the CMU-Cambridge toolkit. The approach was implemented as a prototype for a large-vocabulary continuous speech recognition system, and a performance improvement of about 8% was achieved compared with the language model and dictionary provided by Sphinx3. In this paper we present the experimental results along with discussion, analysis, and potential future directions.
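The clustering step can be illustrated with a small self-organizing map. This is a minimal sketch under stated assumptions, not the prototype described above: the abstract does not specify the input features, the map size, or the training schedule, so the grid shape, the linearly decaying learning rate and neighbourhood width, and the function names `train_som` and `best_matching_unit` are all hypothetical. Building a per-cluster language model with the CMU-Cambridge toolkit is outside the scope of the sketch.

```python
import math
import random


def best_matching_unit(x, units):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(units)), key=lambda k: math.dist(x, units[k]))


def train_som(data, rows=2, cols=2, epochs=30, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols Kohonen map on a list of equal-length tuples."""
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each unit from a randomly chosen data point.
    units = [list(rng.choice(data)) for _ in coords]

    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # decaying neighbourhood
            br, bc = coords[best_matching_unit(x, units)]
            for k, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the unit toward x, scaled by its grid distance to the winner.
                units[k] = [w + lr * h * (xi - w) for w, xi in zip(units[k], x)]
            step += 1
    return units
```

After training, `best_matching_unit(x, units)` gives the cluster label of a sample `x`. In a pipeline like the one described, the samples sharing a label would form the data from which that cluster's language model is built.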