Techniques for Noise Robustness in Automatic Speech Recognition 2012
DOI: 10.1002/9781118392683.ch8

Features Based on Auditory Physiology and Perception

Abstract: It is well known that human speech processing capabilities far surpass the capabilities of current automatic speech recognition and related technologies, despite very intensive research in automated speech technologies in recent decades. Indeed, since the early 1980s, this observation has motivated the development of speech recognition feature extraction approaches that are inspired by auditory processing and perception, but it is only relatively recently that these approaches have become effective in their a…

Cited by 14 publications (6 citation statements)
References 114 publications (105 reference statements)
“…Hence, C1 may represent early auditory processing with subband filters (as shown in Figure 1) and half-wave rectified nonlinearity (i.e., ReLU). The feature representation steps involved in this ordering resemble the simplified form of auditory processing in the human ear [6], [12].…”
Section: Analysis Of First Layer C1 (mentioning)
confidence: 99%
“…We have proposed single layer unsupervised learning model Convolutional Restricted Boltzmann Machine (ConvRBM) to learn filterbanks directly from the speech signals. The computational auditory models for early auditory and auditory cortex are discussed in [12].…”
Section: Introduction (mentioning)
confidence: 99%
“…Recent studies of speech enhancement have focused on improving the acoustic waveform and testing the improvement using neural-network-based models for the auditory cortex (Fu et al, 2017; K. Tan & Wang, 2018); however, this method overlooks the role of the physiological nonlinearities in the auditory periphery (Drakopoulos et al, 2022; Zaar & Carney, 2022). Speech recognition systems that include a physiologically realistic model of the auditory periphery have been shown to be more robust in noisy environments than the model that does not include the physiological properties of the auditory system (Stern et al, 2012; Stern & Morgan, 2012).…”
Section: Introduction (mentioning)
confidence: 99%
“…Mel filterbank is the state-of-the-art auditory-based features for the ESC task. Such handcrafted features rely on the simplified auditory models [15]. There are many approaches that are based on data-driven learning and/or optimization of parameters of auditory models.…”
Section: Introduction (mentioning)
confidence: 99%
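
The first citation statement above describes an ordering of subband filtering followed by half-wave rectification (ReLU). A minimal Python sketch of that ordering follows. It is an illustration under stated assumptions, not the method of the cited chapter or of the citing papers: the Gaussian-windowed cosine filters, the function names, and all parameter values (filter length, bandwidth, centre frequencies) are hypothetical choices made here.

```python
import math

def gabor_filter(center_hz, sample_rate, length=65, bandwidth_hz=100.0):
    """Hypothetical subband filter: a Gaussian-windowed cosine centred on center_hz."""
    mid = (length - 1) / 2
    # Window width in samples; gives a frequency response roughly bandwidth_hz wide.
    sigma = sample_rate / (2 * math.pi * bandwidth_hz)
    return [
        math.exp(-0.5 * ((n - mid) / sigma) ** 2)
        * math.cos(2 * math.pi * center_hz * (n - mid) / sample_rate)
        for n in range(length)
    ]

def convolve_valid(signal, kernel):
    """Direct-form convolution, keeping only fully overlapping positions."""
    k = len(kernel)
    flipped = kernel[::-1]
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def half_wave_rectify(samples):
    """Half-wave rectification: negative values are set to zero (same as ReLU)."""
    return [max(0.0, v) for v in samples]

def subband_features(signal, sample_rate, centers_hz):
    """Filter the signal into subbands, then rectify each subband output."""
    return [
        half_wave_rectify(convolve_valid(signal, gabor_filter(c, sample_rate)))
        for c in centers_hz
    ]

# Usage: a 1 kHz tone sampled at 8 kHz, analysed with three assumed centre frequencies.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(400)]
centers = [500, 1000, 2000]
bands = subband_features(tone, sample_rate, centers)
energies = [sum(band) for band in bands]
strongest = centers[energies.index(max(energies))]
print("strongest subband centre (Hz):", strongest)
```

In a learned front end such as the one the citing paper describes, the filter shapes would come from training rather than from a fixed formula; the fixed filters here only make the filtering and rectification steps concrete.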