A temporal warped 2D psychoacoustic modeling for robust speech recognition system

Dai, Peng; Soon, Ing Yann

doi:10.1016/j.specom.2010.09.004

Cited by 11 publications

(20 citation statements)

References 18 publications

“…The auditory mechanism underlying temporal masking is still less well understood than simultaneous masking (e.g., Plack, 1996; Zeng, 1998; Plack et al, 2002; Oberfeld, 2008, 2009; Laback et al, 2011). In applied contexts like audio coding or communication engineering there is still a potential for integrating effects of temporal masking into the models (e.g., Dai and Soon, 2011; Gunawan et al, 2010; Rhebergen et al, 2010).…”

Section: Introduction (mentioning)

confidence: 99%

Binaural release from masking in forward-masked intensity discrimination: Evidence for effects of selective attention

2012

“…For better alignment or matching, normalization of the subband temporal modulation envelopes may be used [17]. The main factors responsible for the stagnation in the fields of speech recognition are environmental noise, channel distortion, and speaker variability [10], [11], [12]. Let us consider simple speech recognition as a pattern matching problem.…”

Section: Speech Recognition (mentioning)

confidence: 99%

Effect of MFCC Based Features for Speech Signal Alignments

Singh, Khanna, Lehana

2013

IJNLC

“…1. Compared with the filters introduced in our previous work [16][17][18], the parameters of the filters (see Table C1) introduced in this paper are positive or zero. Therefore, they are referred to as 'P-filter' and other algorithms will be called 'N-filter' (short for negative filter parameters).…”

mentioning

confidence: 94%

“…For example, a person with a healthy auditory system has little difficulty in communicating with other people in a crowded shopping mall, which would be a very challenging task for modern ASR [12][13][14][15]. Therefore, our approach is to analyze and model the human auditory system in order to improve the performance of ASR [14,16,17]. Psychoacoustics is the study of the physical human auditory system, and many of its theories can be applied to artificial systems [6].…”

mentioning

confidence: 99%

2D Psychoacoustic modeling of equivalent masking for automatic speech recognition

et al. 2015

Self Cite
