2006
DOI: 10.1109/tasl.2006.872614

Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments

Abstract: Speech recognition performance degrades significantly in distant-talking environments, where the speech signals can be severely distorted by additive noise and reverberation. In such environments, the use of microphone arrays has been proposed as a means of improving the quality of captured speech signals. Currently, microphone-array-based speech recognition is performed in two independent stages: array processing and then recognition. Array processing algorithms, designed for signal enhancement, are …

Cited by 35 publications (18 citation statements)
References 41 publications
“…The weighting factors represented by equations (5), (6), (7), (8), and (9) were independently implemented. These are denoted "equal weighting," "inverse distance weighting," "inverse distance squared weighting," "signal amplitude weighting," and "signal power weighting.…”
Section: Results (mentioning)
confidence: 99%
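
The five weighting schemes named in the statement above can be illustrated with a minimal sketch. The citing paper's equations (5) to (9) are not reproduced on this page, so the formulas below are assumptions inferred only from the scheme names, and the function names `channel_weights` and `combine_features` are hypothetical. Weights are normalized to sum to 1, which is also an assumption.

```python
import numpy as np

def channel_weights(scheme, distances, signals):
    """Return per-microphone weights that sum to 1 (assumed normalization).

    distances: shape (M,), source-to-microphone distances (assumed known).
    signals:   shape (M, T), one row of samples per microphone.
    """
    distances = np.asarray(distances, dtype=float)
    signals = np.asarray(signals, dtype=float)
    if scheme == "equal":
        raw = np.ones(len(distances))
    elif scheme == "inverse_distance":
        raw = 1.0 / distances
    elif scheme == "inverse_distance_squared":
        raw = 1.0 / distances**2
    elif scheme == "signal_amplitude":
        # Mean absolute amplitude per channel (assumed definition).
        raw = np.mean(np.abs(signals), axis=1)
    elif scheme == "signal_power":
        # Mean squared amplitude per channel (assumed definition).
        raw = np.mean(signals**2, axis=1)
    else:
        raise ValueError(f"unknown weighting scheme: {scheme}")
    return raw / raw.sum()

def combine_features(weights, features):
    """Weighted sum over the microphone axis.

    features: shape (M, F, D), i.e. F frames of D-dimensional features per microphone.
    Returns shape (F, D).
    """
    return np.tensordot(weights, features, axes=1)
```

Closer or louder microphones receive larger weights under the distance- and signal-based schemes; whether that helps recognition is what the citing paper evaluates, not something this sketch shows.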
“…McCowan and Sridharan [7] performed sub-band processing of microphone array signals for speech recognition by integrating dynamically-weighted models trained on each sub-array frequency bands based on sub-band speech energy. By also using microphone arrays, Seltzer, Raj, and Stern formulated full-band [4] and sub-band [8] beamforming methods for optimally combining microphone array signals to generate the sequence of features that maximize the likelihood of producing the correct hypothesis. In contrast, Shimizu, Kajita, Takeda, and Itakura [9] developed methods using a fixed sound source that perform speech recognition for each microphone and selects the highest likelihood or equally weights and combines the feature vectors from the microphones.…”
Section: Introduction (mentioning)
confidence: 99%
“…Additionally, ASR techniques robust to reverberation can be also classified according to the number of microphones used to capture the signal such as single-channel methods [13,20,26,29] or multi-channel techniques [12,27,30,31].…”
Section: Distant-talking ASR (mentioning)
confidence: 99%
“…In case of multi-channel ASR, there have been studies on designing a beamformer with the aim of optimizing ASR performance. A technique such as likelihood maximizing beamforming (LIMABEAM) [4,5] specifically optimizes array parameters using gradient descent to maximize the likelihood of the recognized hypothesis under an ASR speech model, given the filtered acoustic data. Recent research on LIMABEAM suggests no significant improvement using the standard LIMABEAM on large vocabulary distant speech recognition on the AMI meeting corpus and it is recommended to use a better optimization strategy for any LIMABEAM implementation [6].…”
Section: Introduction (mentioning)
confidence: 99%
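
The idea described in the statement above, adjusting array parameters by gradient steps so that the resulting features become more likely under a speech model, can be sketched in toy form. This is not the LIMABEAM implementation: the features here are plain frame log-energies, the "speech model" is one fixed Gaussian per frame (standing in for the Gaussians selected by a recognizer's hypothesis), and the gradient is estimated by finite differences. All function names are hypothetical.

```python
import numpy as np

def filter_and_sum(h, x):
    # h: (M, P) FIR taps per microphone; x: (M, T) signals -> (T,) beamformer output.
    return sum(np.convolve(x[m], h[m])[: x.shape[1]] for m in range(x.shape[0]))

def log_energy_features(y, frame=64):
    # One log-energy value per non-overlapping frame (toy feature, an assumption).
    frames = y[: len(y) // frame * frame].reshape(-1, frame)
    return np.log(np.mean(frames**2, axis=1) + 1e-8)

def log_likelihood(h, x, mu, var):
    # Gaussian log-likelihood of the features given per-frame means mu and variance var.
    f = log_energy_features(filter_and_sum(h, x))
    return -0.5 * np.sum((f - mu) ** 2 / var + np.log(2 * np.pi * var))

def numerical_gradient(fn, h, eps=1e-4):
    # Central finite differences over every filter tap.
    g = np.zeros_like(h)
    for idx in np.ndindex(h.shape):
        d = np.zeros_like(h)
        d[idx] = eps
        g[idx] = (fn(h + d) - fn(h - d)) / (2 * eps)
    return g

def optimize_filters(x, mu, var, taps=4, iters=20, step=1e-2):
    # Start from an unsteered delay-and-sum beamformer: equal gain, zero delay.
    h = np.zeros((x.shape[0], taps))
    h[:, 0] = 1.0 / x.shape[0]
    fn = lambda hh: log_likelihood(hh, x, mu, var)
    best = fn(h)
    for _ in range(iters):
        candidate = h + step * numerical_gradient(fn, h)  # ascent on the likelihood
        value = fn(candidate)
        if value > best:
            h, best = candidate, value   # accept only improving steps
        else:
            step *= 0.5                  # otherwise shrink the step and retry
    return h, best
```

Because only improving steps are accepted, the likelihood never decreases from its starting value; a real system would use analytic gradients and the recognizer's own acoustic model in place of these stand-ins.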