2011
DOI: 10.1109/mcas.2011.941078

Usable Speech Processing: A Filterless Approach in the Presence of Interference

citations
Cited by 17 publications
(3 citation statements)
references
References 16 publications
“…In this project, the length of each frame is 30 ms and the overlap between consecutive frames is 20 ms. Each frame is multiplied by a Hamming window and effectively represents the middle 10 ms of its entire 30 ms length. Frame selection is achieved by a voice activity detector that keeps speech-like high energy segments (usable frames [65]) and discards silence (not usable frames) [66].…”
Section: A Speaker Identification Project (mentioning)
confidence: 99%
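The framing described in the statement above can be sketched as follows. This is a minimal illustration, not the cited project's implementation. The quoted text gives the 30 ms frame length, the 20 ms overlap (so a 10 ms hop) and the Hamming window. The function names, the energy-based selection rule and the `rel_threshold_db` value are assumptions made for the example.

```python
import numpy as np


def frame_signal(x, fs, frame_ms=30.0, overlap_ms=20.0):
    """Split x into overlapping frames and apply a Hamming window to each."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = int(round(fs * (frame_ms - overlap_ms) / 1000.0))  # 10 ms hop
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)


def select_usable_frames(frames, rel_threshold_db=-30.0):
    """Keep frames whose energy is within rel_threshold_db of the loudest frame.

    Assumption: a simple relative-energy rule stands in for the voice
    activity detector mentioned in the text.
    """
    if frames.shape[0] == 0:
        return frames, np.zeros(0, dtype=bool)
    energy_db = 10.0 * np.log10(np.sum(frames ** 2, axis=1) + 1e-12)
    keep = energy_db > energy_db.max() + rel_threshold_db
    return frames[keep], keep
```

For a signal sampled at 8 kHz, `frame_signal` yields 240-sample frames advanced by 80 samples, so each frame contributes 10 ms of new signal, which matches the statement that a frame effectively represents the middle 10 ms of its 30 ms length.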
“…The second approach is to detect the presence of more than one speaker at every time instance, which is mainly referred to as overlapped-speech detection [4,5,6]. In many applications, including the problem investigated in this study, the latter suffices in mitigating the degradations caused by co-channel speech (for a review see [7]). An example is reducing errors in speaker diarization by omitting overlapping speech regions [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…Since additive noise at a global signal-to-noise ratio (SNR) corrupts the speech signal non-uniformly over different smaller duration temporal regions, the goal is to find which temporal regions are relatively clean and thereby more useful for speaker recognition. These useful temporal regions are known as the regions of 'usable speech' [4]. The student is currently testing a novel algorithm to blindly detect 'usable speech' and is expected to graduate in one year's time.…”
Section: Example Of An Electrical and Computer Engineering Student (mentioning)
confidence: 99%
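The point in the statement above, that noise added at one global SNR corrupts short temporal regions unevenly, can be illustrated with a per-segment SNR computation. This is only a sketch under stated assumptions: it is an oracle measurement that needs the clean speech and the noise separately, so it is not the blind detection algorithm the statement refers to. The function names and the segment length are hypothetical choices for the example.

```python
import numpy as np


def scale_noise_to_global_snr(speech, noise, snr_db):
    """Scale noise so that the whole-signal SNR equals snr_db."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return gain * noise


def segmental_snr_db(speech, noise, seg_len):
    """Return the SNR in dB of each non-overlapping segment of seg_len samples."""
    n_seg = len(speech) // seg_len
    s = speech[: n_seg * seg_len].reshape(n_seg, seg_len)
    n = noise[: n_seg * seg_len].reshape(n_seg, seg_len)
    eps = 1e-12  # avoids division by zero and log of zero in silent segments
    return 10.0 * np.log10((np.sum(s ** 2, axis=1) + eps) /
                           (np.sum(n ** 2, axis=1) + eps))
```

When stationary noise is mixed with a signal whose level changes over time, the louder segments come out above the global SNR and the quieter ones below it. Segments whose local SNR exceeds some chosen threshold would be the candidates for "usable speech" in this oracle setting.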