Who Said That? a Comparative Study of Non-negative Matrix Factorization Techniques

Krikke, Teun; Broz, Frank; Lane, David M.

doi:10.21437/interspeech.2018-1807

Cited by 1 publication

(3 citation statements)

References 17 publications


“…The above result may be explained by the fact that all Indo-Aryan languages like Marathi and Bengali have more aspirated consonants than English, which are produced with an audible expulsion of breath, whereas the unaspirated are pronounced with minimal breath.…”

Section: B Signal Level Metrics

mentioning

confidence: 94%

“…They also point out that there is immense scope for improving audio source separation in overlapping speech scenarios. DNN, though it shows promising results in separation performance, is characterized by high computational complexity and suffers degraded performance on problems with limited training data or small data sets [25]. NMF, on the other hand, is still prevalent for separation with limited training datasets.…”

Section: Introduction

mentioning

confidence: 99%

“…Figure 5. SDR, SIR, and SAR comparison of (a) NMF-EUC (sparse) [25], (b) NMF-EUC [4], (c) NMF-EUC, current investigation on an overlapped speech signal comprising same language speeches by English female and male speakers and (d) NMF-EUC, current investigation on an overlapped speech signal comprising different language by English male and Marathi female speakers. The y-axis scale for all the figures is same…”

mentioning

confidence: 99%


Non-negative Matrix Factorization on a Multi-lingual Overlapped Speech Signal: A Signal and Perception Level Analysis

Nag; Shah

2022

IJCDS


A complex acoustic scenario in which speech from multiple speakers overlaps in the presence of noise causes speech recognition to perform poorly in hands-free devices. The scenario is more complex still in India, a country where 96.71% of the population speaks one of the 22 scheduled languages. An audio source separation algorithm that mitigates interference from other speakers and enhances the articulacy and quality of the source speech may therefore be added as a pre-processor in speech recognition systems. This research investigates the effectiveness of the non-negative matrix factorization (NMF) algorithm for source separation in an overlapping multi-lingual, multi-dialect, single-channel speech mixture, a scenario characteristic of the cocktail party problem in India. The objective is to analyze the signal-level and perception-level metrics of a speech source separated from a multi-lingual overlapped speech signal. The languages used are English and two Indo-Aryan languages, Marathi and Bengali. One experimental result showed that the source-to-distortion ratio (SDR) of the separated target source from English-Bengali and English-Marathi speech mixtures is 0.4 dB and 1.3 dB higher, respectively, than that from English-English mixtures. The experiments therefore indicate that sources are separated better from mixed speech signals with different-language combinations than from same-language mixtures.
