OpenBliSSART: Design and evaluation of a research toolkit for Blind Source Separation in Audio Recognition Tasks

Weninger, Felix; Lehmann, Alexander; Schuller, Björn

doi:10.1109/icassp.2011.5946809

Cited by 14 publications

(11 citation statements)

References 11 publications

“…The minimization of $d_1$ (7) is performed by the multiplicative update algorithm for convolutive NMF proposed by Smaragdis (2007) and Wang et al (2009), which can be very efficiently implemented using linear algebra routines employing vectorization. Note that the asymptotic complexity of this algorithm is polynomial ($O(RMNP)$), and linear in each of $R := R^{(s)} + R^{(n)}$, $M$, $N$, and $P$. All experiments for this paper were performed with the NMF implementations found in our open-source toolkit openBliSSART (Weninger et al (2011b)) to enforce reproducibility of our results.…”

Section: Convolutive NMF for Speech Enhancement (mentioning)

confidence: 99%

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory

Wöllmer, Weninger, Geiger et al. 2013

Computer Speech & Language

Self Cite

This article proposes and evaluates various methods to integrate the concept of bidirectional Long Short-Term Memory (BLSTM) temporal context modeling into a system for automatic speech recognition (ASR) in noisy and reverberated environments. Building on recent advances in Long Short-Term Memory architectures for ASR, we design a novel front-end for context-sensitive Tandem feature extraction and show how the Connectionist Temporal Classification approach can be used as a BLSTM-based back-end, alternatively to Hidden Markov Models (HMM). We combine context-sensitive BLSTM-based feature generation and speech decoding techniques with source separation by convolutive non-negative matrix factorization. Applying our speaker adapted multi-stream HMM framework that processes MFCC features from NMF-enhanced speech as well as word predictions obtained via BLSTM networks and non-negative sparse classification (NSC), we obtain an average accuracy of 91.86% on the PASCAL CHiME Challenge task at signal-to-noise ratios ranging from -6 to 9 dB. To our knowledge, this is the best result ever reported for the CHiME Challenge task.
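The multiplicative-update scheme mentioned in the citation statement above can be illustrated with a minimal sketch. It is a simplified stand-in and not the openBliSSART implementation: it assumes plain (non-convolutive) NMF with a Euclidean cost and the standard Lee-Seung update rules, rather than the convolutive variant and the cost function $d_1$ of the cited paper. The function name `nmf_multiplicative` and all of its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative M x N matrix V as W @ H (W: M x rank, H: rank x N).

    Assumption: Euclidean cost with Lee-Seung multiplicative updates,
    used here as a simple stand-in for the convolutive algorithm.
    """
    rng = np.random.default_rng(seed)
    M, N = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((M, rank)) + eps
    H = rng.random((rank, N)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update is a handful of vectorized matrix products.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors


# Toy example: a 20 x 30 non-negative matrix of exact rank 3.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H, errors = nmf_multiplicative(V, rank=3)
print("reconstruction error after first iteration:", errors[0])
print("reconstruction error after last iteration:", errors[-1])
```

Each iteration is dominated by matrix products of cost O(RMN) for R components and an M x N input, provided R is no larger than M and N. The additional factor P in the complexity quoted above belongs to the convolutive variant, which this sketch does not implement.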

“…The importance of the former two parameters on separation quality has been pointed out in [24], and various previous studies clearly suggest that different cost functions may be optimal for different source separation problems [5,30,32]. Still, to our knowledge, the trade-off between separation quality and the RTF has been rarely investigated in the light of these parameters, although the algorithms minimizing different cost functions considerably differ in the number of required matrix operations, and their complexity (cf.…”

Section: Benchmark Performances in Supervised Speech Separation (mentioning)

confidence: 99%

“…Source code and demonstrations can be found at http://openblissart.github.com/openBliSSART. We have introduced openBliSSART in [32]; since then, one remarkable development has been to parallelize computationally intensive parts of the algorithms on GPUs following the Compute Unified Device Architecture (CUDA) standard. An earlier study [1] proposed the usage of CUDA for NMF, but its evaluation was limited to a single NMF algorithm and the matrix parameterization typically encountered in musical instrument separation.…”

Section: Introduction (mentioning)

confidence: 99%

Optimization and Parallelization of Monaural Source Separation Algorithms in the openBliSSART Toolkit

Weninger, Schuller 2012

J Sign Process Syst

Self Cite

We describe the implementation of monaural audio source separation algorithms in our toolkit openBliSSART (Blind Source Separation for Audio Recognition Tasks). To our knowledge, it provides the first freely available C++ implementation of non-negative matrix factorization (NMF) supporting the Compute Unified Device Architecture (CUDA) for fast parallel processing on graphics processing units (GPUs). Besides integrating parallel processing, openBliSSART introduces several numerical optimizations of commonly used monaural source separation algorithms that reduce both computation time and memory usage. By illustrating a variety of use-cases from audio effects in music processing to speech enhancement and feature extraction, we demonstrate the wide applicability of our application framework for a multiplicity of research and end-user applications. We evaluate the toolkit by benchmark results of the NMF algorithms and discuss the influence of their parameterization on source separation quality and real-time factor. As a result, the GPU parallelization in openBliSSART introduces double-digit speedups with respect to conventional CPU computation, enabling real-time processing on a desktop PC even for high matrix dimensions.
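The real-time factor (RTF) discussed in the abstract above is commonly defined as processing time divided by the duration of the processed signal, so values below 1 indicate faster-than-real-time processing. The sketch below shows one way such a measurement could be taken. It assumes this common definition, and the helper names `rtf_from_times` and `real_time_factor` are hypothetical and not part of openBliSSART.

```python
import time


def rtf_from_times(elapsed_seconds, signal_seconds):
    """Real-time factor: processing time divided by signal duration."""
    if signal_seconds <= 0:
        raise ValueError("signal duration must be positive")
    return elapsed_seconds / signal_seconds


def real_time_factor(process, samples, sample_rate):
    """Time one call of process(samples) and return its real-time factor.

    Assumption: `samples` is a sequence of mono samples recorded at
    `sample_rate` Hz, and `process` is any callable that consumes it.
    """
    signal_seconds = len(samples) / sample_rate
    start = time.perf_counter()
    process(samples)
    elapsed = time.perf_counter() - start
    return rtf_from_times(elapsed, signal_seconds)


# Toy example: one second of silence at 16 kHz, "processed" by summing it.
one_second = [0.0] * 16000
print("RTF:", real_time_factor(sum, one_second, 16000))
```

A single timing like this is noisy. Benchmarks typically average over several runs and report the hardware used, since the RTF depends on both.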

“…It includes various source separation algorithms, with a strong focus on variants of Non-Negative Matrix Factorization (NMF). Furthermore, supervised NMF can be performed for source separation as well as audio feature extraction (Weninger et al 2017). It should be noted that openBliSSART has built-in components to separate the HARMONIC and DRUM instruments.…”

Section: openBliSSART (mentioning)

confidence: 99%

“…The openBliSSART application is a C++ toolbox that provides Blind Source Separation for Audio Recognition Tasks (Weninger et al 2011). Besides the basic blind (unsupervised) source separation, classification by Support Vector Machines (SVM) using common acoustic features from speech and music processing is implemented.…”

Section: openBliSSART (mentioning)

confidence: 99%

Automatic music genre classification based on musical instrument track separation

Rosner, Kostek 2017

J Intell Inf Syst

The aim of this article is to investigate whether separating music tracks at the preprocessing phase and extending the feature vector with parameters related to the specific musical instruments that are characteristic of the given musical genre allow for efficient automatic musical genre classification in the case of a database containing thousands of music excerpts and a dozen genres. Results of extensive experiments show that the approach proposed for music genre classification is promising. Overall, conglomerating parameters derived from both the original audio and a mixture of separated tracks improves classification effectiveness measures, demonstrating that the proposed feature vector and the Support Vector Machine (SVM) with Co-training mechanism are applicable to a large dataset.
