Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments

Lu, Xugang; Unoki, Masashi; Nakamura, Satoshi

doi:10.1145/1667780.1667832

Cited by 2 publications

(1 citation statement)

References 24 publications

“…8.2.4, was developed to emphasize the critical temporal modulations (and in so doing emphasizes transitions, roughly models forward masking, and reduces sensitivity to irrelevant steady state convolutional factors). More recently, temporal modulation in subbands was normalized to improve ASR in reverberant environments [60].…”

Section: Current Trends In Auditory Feature Analysis (mentioning)

confidence: 99%

Features Based on Auditory Physiology and Perception

Stern, Morgan

2012

Techniques for Noise Robustness in Automatic Speech Recognition

It is well known that human speech processing capabilities far surpass the capabilities of current automatic speech recognition and related technologies, despite very intensive research in automated speech technologies in recent decades. Indeed, since the early 1980s, this observation has motivated the development of speech recognition feature extraction approaches that are inspired by auditory processing and perception, but it is only relatively recently that these approaches have become effective in their application to computer speech processing. The goal of this chapter is to review some of the major ways in which feature extraction schemes based on auditory processing have facilitated greater speech recognition accuracy in recent years, as well as to provide some insight into the nature of current trends and future directions in this area.

We begin this chapter with a brief review of some of the major physiological and perceptual phenomena that have motivated feature extraction algorithms based on auditory processing. We continue with a review and discussion of three seminal 'classical' auditory models of the 1980s that have had a major impact on the approaches taken by more recent contributors to this field. Finally, we turn our attention to selected more recent topics of interest in auditory feature analysis, along with some of the feature extraction approaches that have been based on them. We conclude with a discussion of the attributes of auditory models that appear to be most effective in improving speech recognition accuracy in difficult acoustic environments.

Some Attributes of Auditory Physiology and Perception

In this section we very briefly review and discuss a selected set of attributes of auditory physiology that historically or currently have been the object of attention by developers of

Techniques for Noise Robustness in Automatic Speech Recognition, Virtanen, Singh, and Raj (eds)
