2016
DOI: 10.5120/ijca2016912112

Itakura-Saito Divergence Non Negative Matrix Factorization with Application to Monaural Speech Separation

Abstract: Monaural source separation has received much attention in the signal processing community because it is a pre-processing step in many applications. Many solutions based on Non-Negative Matrix Factorization (NMF) have been developed to achieve clean separation. In this work, we propose a variant of Itakura-Saito divergence NMF based on a source-filter model that captures the temporal continuity of the speech signal. The algorithm shows very good separation results for a mixture of two…
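For orientation, the sketch below shows plain Itakura-Saito NMF with multiplicative updates, the baseline that work of this kind builds on. It is not the source-filter variant proposed in the paper, whose details are not given in the abstract. The function names `is_divergence` and `is_nmf`, the random initialization, the square-root (majorization-minimization) form of the updates, and the `eps` safeguard are all assumptions made for illustration. The input `V` is assumed to be a strictly positive spectrogram (typically a power spectrogram) of shape frequencies × frames.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence: sum of x/y - log(x/y) - 1 over all entries."""
    ratio = V / V_hat
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def is_nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factor V ~= W @ H under the IS divergence (illustrative sketch).

    W holds spectral atoms (n_freq x rank), H holds their activations
    over time (rank x n_frames). Both stay non-negative because each
    update multiplies by a non-negative factor.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    losses = []
    for _ in range(n_iter):
        # Update activations with the dictionary held fixed.
        V_hat = W @ H + eps
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps))
        # Update dictionary with the activations held fixed.
        V_hat = W @ H + eps
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps))
        losses.append(is_divergence(V, W @ H + eps))
    return W, H, losses


if __name__ == "__main__":
    # Synthetic stand-in for a spectrogram: a rank-3 non-negative matrix.
    rng = np.random.default_rng(1)
    V = rng.random((64, 3)) @ rng.random((3, 100)) + 1e-6
    W, H, losses = is_nmf(V, rank=3)
    print(f"IS divergence: first iteration {losses[0]:.3f}, last {losses[-1]:.3f}")
```

In a two-speaker separation setting, one common approach (again an assumption here, not a description of the paper's pipeline) is to assign subsets of the columns of `W` to each speaker and reconstruct each source with a Wiener-style mask built from the corresponding partial products of `W` and `H`.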

Citations: cited by 2 publications (3 citation statements)
References: 5 publications
“…Each of the speakers was made to make a sentence which varied in duration of roughly 4 to 8 seconds at a sampling frequency of 16kHz. The set up was similar to what is obtainable in (Adewusi et al, 2016). The magnitude spectrograms of the time-domain signal were obtained using the Short Time Fourier ( )…”
Section: Methods (mentioning)
confidence: 99%
“…It usually comprises low-power transient components such as note tracks as well as higher power components such as tonal parts of sustained notes. IS has been successfully used for the improvement of the separation performance in monaural problem with group sparsity and temporal continuity (Lefèvre et al, 2011; Févotte et al, 2009; Adewusi et al, 2016).…”
Section: Itakura-Saito (IS) Distance (mentioning)
confidence: 99%
“…Fig. 2 represents the left and right input in the case of stereo audio signals that has been concatenated in time [34] where Fig. 2a shows the single dictionary of spectral atoms which is used to encode both channels via the two coefficient matrices H l dt and H r dt shown in Fig.…”
Section: Tree Structured Wavelet Filter Bank (mentioning)
confidence: 99%