International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.1990.115970

Hidden Markov model decomposition of speech and noise

Abstract: This paper addresses the problem of automatic speech recognition in the presence of interfering signals and noise with statistical characteristics ranging from stationary to fast-changing and impulsive. A technique of signal decomposition using hidden Markov models [1] is described. This is a generalisation of conventional hidden Markov modelling that provides an optimal method of decomposing simultaneous processes. The technique exploits the ability of hidden Markov models to model dynamically varying signal…

Citations: cited by 344 publications (143 citation statements)
References: 4 publications
“…We denote the observed speech and additive noise mixture in channel d of time frame τ in the log-mel-spectral domain by Y (τ, d). According to the so-called logmax approximation (Nádas et al, 1989;Varga and Moore, 1990) the log-magnitude-compressed observations can be approximated as…”
Section: Mask Estimation (mentioning)
confidence: 99%
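
The quoted statement breaks off before giving its formula, so the following is only a minimal sketch of the log-max approximation as it is commonly stated, not a reconstruction of the citing paper's equation. It assumes that speech and noise energies add in the linear domain and that natural logarithms are used; the function names `log_add`, `log_max`, and `approximation_error` are hypothetical.

```python
import math


def log_add(x_log, n_log):
    """Exact log of the sum of two energies given as natural-log values."""
    hi, lo = max(x_log, n_log), min(x_log, n_log)
    # log(e^hi + e^lo) = hi + log(1 + e^(lo - hi)), which is numerically stable
    return hi + math.log1p(math.exp(lo - hi))


def log_max(x_log, n_log):
    """Log-max approximation: the mixture is taken to equal the larger term."""
    return max(x_log, n_log)


def approximation_error(x_log, n_log):
    """Gap between the exact log-sum and the log-max approximation."""
    return log_add(x_log, n_log) - log_max(x_log, n_log)
```

Under these assumptions the error is never negative and never exceeds log 2, a bound it reaches when the two components are equal. It shrinks quickly as one component dominates, which is why the approximation is attractive for speech that is either well above or well below the noise in a given channel.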
“…This is reminiscent of the 'HMM decomposition' approach [Moore 1986, Varga & Moore 1990] which treats a mixture as the combined output of several processes, each described by a hidden Markov model. In that case, the observation probability is…”
Section: Recognizing Speech Given Nonspeech Estimates (mentioning)
confidence: 99%
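
The idea of treating a mixture as the combined output of several HMMs can be illustrated with a Viterbi search over pairs of states. This is a rough sketch under stated assumptions, not the paper's algorithm: observations are scalar log-energies, each state emits a Gaussian, and the two streams combine by the max rule, so the density of y = max(x, n) is f_x(y)F_n(y) + f_n(y)F_x(y). The dictionary layout (`init`, `trans`, `emit`) and all function names are hypothetical.

```python
import math
from itertools import product


def norm_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(y, mu, sigma):
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def max_model_likelihood(y, speech, noise):
    """Density of y = max(x, n) for independent Gaussian x and n."""
    (mx, sx), (mn, sn) = speech, noise
    return (norm_pdf(y, mx, sx) * norm_cdf(y, mn, sn)
            + norm_pdf(y, mn, sn) * norm_cdf(y, mx, sx))


def safe_log(p):
    return math.log(p) if p > 0.0 else float("-inf")


def decompose(obs, speech_hmm, noise_hmm):
    """Return the most likely sequence of (speech state, noise state) pairs."""
    pairs = list(product(range(len(speech_hmm["init"])),
                         range(len(noise_hmm["init"]))))

    def emit(y, i, j):
        return safe_log(max_model_likelihood(
            y, speech_hmm["emit"][i], noise_hmm["emit"][j]))

    def move(prev, i, j):
        # The two chains evolve independently, so log transitions add.
        return (safe_log(speech_hmm["trans"][prev[0]][i])
                + safe_log(noise_hmm["trans"][prev[1]][j]))

    score = {(i, j): safe_log(speech_hmm["init"][i])
             + safe_log(noise_hmm["init"][j]) + emit(obs[0], i, j)
             for (i, j) in pairs}
    back = []
    for y in obs[1:]:
        new_score, pointers = {}, {}
        for (i, j) in pairs:
            best = max(pairs, key=lambda p: score[p] + move(p, i, j))
            new_score[(i, j)] = score[best] + move(best, i, j) + emit(y, i, j)
            pointers[(i, j)] = best
        score = new_score
        back.append(pointers)

    # Trace the best final pair back to the first frame.
    state = max(pairs, key=lambda p: score[p])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

The search cost grows with the square of the number of state pairs per frame, which is the practical price of modelling both processes jointly rather than recognising speech against a fixed noise estimate.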
“…At the hidden layer, articulatory-based approaches to speech recognition are becoming more popular [6], where the speech signal is represented by multiple semi-synchronous streams of articulatory gestures. Early multistream work also includes that of HMM decomposition [7], where both speech and noise are consider a separate stream. Dynamic Bayesian networks (DBNs) have also been used for multi-stream [8], including audio-visual speech recognition [9,10,11].…”
Section: Introduction (mentioning)
confidence: 99%