2011
DOI: 10.1016/j.specom.2010.09.004

A temporal warped 2D psychoacoustic modeling for robust speech recognition system

citations
Cited by 11 publications
(20 citation statements)
references
References 18 publications
“…The auditory mechanism underlying temporal masking is still less well understood than simultaneous masking (e.g., Plack, 1996; Zeng, 1998; Plack et al, 2002; Oberfeld, 2008, 2009; Laback et al, 2011). In applied contexts like audio coding or communication engineering there is still a potential for integrating effects of temporal masking into the models (e.g., Dai and Soon, 2011; Gunawan et al, 2010; Rhebergen et al, 2010).…”
Section: Introduction
mentioning
confidence: 99%
“…For better alignment or matching, normalization of the subband temporal modulation envelopes may be used [17]. The main factors responsible for the stagnation in the fields of speech recognition are environmental noise, channel distortion, and speaker variability [10], [11], [12]. Let us consider simple speech recognition as a pattern matching problem.…”
Section: Speech Recognition
mentioning
confidence: 99%
“…1. Compared with the filters introduced in our previous work [16][17][18], the parameters of the filters (see Table C1) introduced in this paper are positive or zero. Therefore, they are referred to as 'P-filter' and other algorithms will be called 'N-filter' (short for negative filter parameters).…”
mentioning
confidence: 94%
“…For example, a person with a healthy auditory system has little difficulty in communicating with other people in a crowded shopping mall, which would be a very challenging task for modern ASR [12][13][14][15]. Therefore, our approach is to analyze and model the human auditory system in order to improve the performance of ASR [14,16,17]. Psychoacoustics is the study of the physical human auditory system, and many of its theories can be applied to artificial systems [6].…”
mentioning
confidence: 99%