2012
DOI: 10.1109/msp.2012.2207989

Hearing Is Believing: Biologically Inspired Methods for Robust Automatic Speech Recognition

citations
Cited by 50 publications
(22 citation statements)
references
References 44 publications
“…Recently, auditory-inspired Constant-Q Cepstral Coefficients (CQCC) also perform better compared to MFCC features specifically in detecting spoof by unit selection speech synthesis system [12], [13]. Such handcrafted features rely on simplified auditory models [14], [15].…”
Section: Introduction (mentioning)
confidence: 99%
“…Findings about the auditory system have influenced research in automatic speech recognition (ASR), which often resulted in more robust machine listening [1,2]. Although a closer connection between ASR and human speech recognition (HSR) has been promoted earlier to further our understanding of speech processing in humans and machines [3], there is a comparatively small number of studies that bring back ASR technology that profited from auditory insights to better understand HSR; important exceptions are for instance [4] and [5].…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, psychoacoustic experiments also demonstrate that the joint spectro-temporal modulations are highly related to speech intelligibility [27] and speech comprehension [11]. The concept of using spectro-temporal modulations has since been adopted in many applications, such as speech intelligibility assessment [28], musical instrument identification [29], and robust feature extraction for automatic speech [30], [31] and speaker recognition [32].…”
Section: Introduction (mentioning)
confidence: 99%