TENCON 2009 - 2009 IEEE Region 10 Conference 2009
DOI: 10.1109/tencon.2009.5396022

Recent trends and challenges in speech-separation systems research — A tutorial review

Citations: cited by 7 publications (5 citation statements)
References: 25 publications
“…This means that the typical pitch range for both male and female from 80 to 500 Hz or time lags from 2 to 12.5 ms now corresponds to 50 to 313 samples, i.e. pitch state space which consists of the union of 0, 1, and 2 pitch hypothesis, since this algorithm tracks up to 2 pitches simultaneously, is described by (1). HMM search to be biased towards S2.…”
Section: HMM Multipitch Tracking (mentioning)
confidence: 99%
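
The arithmetic in the statement above can be checked with a short sketch. The excerpt does not state a sampling rate; 25 kHz is assumed here only because it is the rate at which a 2 ms lag equals 50 samples. The function names are hypothetical, and counting the 2-pitch hypotheses as unordered pairs of distinct lags is likewise an assumption, not something the quoted paper confirms.

```python
import math

# Assumed sampling rate: inferred from "2 ms corresponds to 50 samples",
# not stated in the quoted excerpt.
ASSUMED_SAMPLE_RATE_HZ = 25_000


def pitch_range_to_lag_samples(f_min_hz, f_max_hz, sample_rate_hz):
    """Map a pitch range in Hz to an inclusive lag range in samples.

    The highest pitch gives the shortest lag (rounded down) and the
    lowest pitch gives the longest lag (rounded up).
    """
    min_lag = sample_rate_hz // f_max_hz
    max_lag = -(-sample_rate_hz // f_min_hz)  # integer ceiling division
    return min_lag, max_lag


def state_space_sizes(min_lag, max_lag):
    """Count the 0-, 1-, and 2-pitch hypotheses over the lag range.

    Assumption: a 2-pitch hypothesis is an unordered pair of distinct lags.
    """
    n_lags = max_lag - min_lag + 1
    return {0: 1, 1: n_lags, 2: math.comb(n_lags, 2)}


lags = pitch_range_to_lag_samples(80, 500, ASSUMED_SAMPLE_RATE_HZ)
sizes = state_space_sizes(*lags)
print(lags, sizes)
```

Under these assumptions the 80 to 500 Hz range maps to lags of 50 to 313 samples, matching the figures in the quoted statement. The 2-pitch subspace is far larger than the other two, which is why the size of each subspace matters when searching over the union.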
“…Competing speech is the most difficult kind of interference because of the similarity of temporal and spectral characteristics between target and interfering speeches. Work on speech segregation dates back to 70s [1].…”
Section: Introduction (mentioning)
confidence: 99%
“…Monaural source separation or single-channel source separation works with two learning methods, supervised learning (models can use previous experience to produce outcomes), and unsupervised learning (models do not have previous experience). Existing review articles describe supervised single-channel speaker separation algorithms in either signal processing [21]-[23] or in the time-frequency [24], [25] domains. The conventional single channel speaker separation techniques such as computational auditory sense analysis (CASA) [26], non-negative matrix factorization (NMF) [27], [28] in the signal processing domain and deep learning-based deep clustering (DC) [29], deep attractor networks (DANet) [30], permutation invariant training (PIT) [31], in T-F domain have been reviewed in [32].…”
Section: Introduction (mentioning)
confidence: 99%
“…Speech enhancement allows for the extraction of the desired speech signal from a mixture of speech with interfering sounds coming from different sources. These methods can be used in hearing aids [1], smartphones [2], or, as a pre-processing step, in automatic speech or speaker recognition [3].…”
Section: Introduction (mentioning)
confidence: 99%