Jet is an open-source, high-performance, distributed stream processor built at Hazelcast over the last five years. Jet was engineered with millisecond latency at the 99.99th percentile as its primary design goal. Originally, Jet's purpose was to serve as an execution engine that performs complex business logic on top of streams generated by Hazelcast's In-Memory Data Grid (IMDG): a set of in-memory, partitioned and replicated data structures. With time, Jet evolved into a full-fledged, scale-out stream processor that can handle out-of-order streams and provide exactly-once processing guarantees. Jet's end-to-end latency is on the order of milliseconds, and its throughput on the order of millions of events per CPU core. This paper presents the main design decisions we made to maximize performance per CPU core, alongside lessons learned and an empirical performance evaluation.
In this study, we analyze the effect of our previously proposed catalog-based single-channel speech-music separation method on speech recognition performance. In that method, assuming that a catalog of the background music is known, we developed a generative model for the superposed speech and music spectrograms. We represent the speech spectrogram by a Non-negative Matrix Factorization (NMF) model and the music spectrogram by a conditional Poisson Mixture Model (PMM). In this paper, we propose to recover the speech signal from the mixed signal in the time domain by detecting the active catalog frames using the catalog-based method. We compare the performance of three signal reconstruction techniques: Expectation-Based, Posterior-Based and Time-Domain reconstruction. Moreover, we compare the performance of our system with that of the traditional NMF model. Our method outperforms the NMF method in both ASR performance and separation performance in most experimental conditions.
TREN (Turkish Recognition ENgine) is a modular, HMM-based (Hidden Markov Model), speaker-independent speech recognition system whose software architecture is based on the Distributed Component Object Model (DCOM). TREN contains specialized modules that provide a fully interoperable platform, including a Turkish speech recognizer, a feature extractor, an end-point detector and a performance monitoring module. TREN has two layers. The first layer is the central server, which performs some speech signal preprocessing and then distributes recognition calls to the appropriate remote servers according to the current CPU load of their recognition processes. The second layer consists of the remote servers, which perform the critical recognition task. This component-based architecture makes TREN applicable to distributed environments. TREN is also trained on a wide variety of very common words that best represent the Turkish language. To obtain such a database, a very large word corpus was collected and the statistically widest span of triphones representing Turkish was examined. TREN has been used to support speech technologies that require a modular, multithreaded recognizer with dynamic load-sharing facilities.
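The load-based dispatch performed by the central server can be illustrated with a minimal sketch. It assumes that each remote server reports its CPU load as a fraction between 0 and 1 and that the least-loaded server receives the next call. The names `RemoteServer`, `pick_server`, `dispatch` and the `load_per_call` estimate are hypothetical and are not taken from TREN, whose actual implementation is DCOM-based.

```python
from dataclasses import dataclass


@dataclass
class RemoteServer:
    """A remote recognition server as seen by the central server (hypothetical)."""
    name: str
    cpu_load: float  # assumed to be a reported fraction in [0, 1]


def pick_server(servers):
    """Return the server with the lowest reported CPU load."""
    if not servers:
        raise ValueError("no remote recognition servers available")
    return min(servers, key=lambda s: s.cpu_load)


def dispatch(servers, call_id, load_per_call=0.1):
    """Assign one recognition call to the least-loaded server.

    load_per_call is an assumed estimate of the extra load one call adds;
    a real system would rely on fresh load reports instead.
    """
    target = pick_server(servers)
    target.cpu_load = min(1.0, target.cpu_load + load_per_call)
    return call_id, target.name
```

In this sketch the central server updates its local load estimate after every assignment, so a burst of calls arriving between two load reports is not sent entirely to the same server.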