2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
DOI: 10.1109/icassp.2003.1198717

Robust speech recognition using features based on zero crossings with peak amplitudes

Abstract: This paper presents an extensive study of zero crossings with peak amplitudes (ZCPA) features, which have earlier been shown to outperform both conventional and auditory-based features in the presence of additive noise. The study starts by optimizing different parameters involved in ZCPA feature computation, followed by a comparison of ZCPA and MFCC features on two recognition tasks in different background conditions. The main differences between the two feature types were identified, and their individual effects o…

Cited by 16 publications (16 citation statements)
References 5 publications
“…These may be used for applications such as human-computer interaction, document preparation and dictation, telephone voice response systems, database access, hands-free applications as in car phones or voice-enabled PDAs, web enabling via voice, to name but a few. In spite of focused research in this field for the past several decades, the understanding of the acoustic-phonetic characteristics of speech, speech variability and speech perception is far from complete, and robust speech recognition with high reliability has not been achieved (Kim et al., 1999; Abdelatty et al., 1999; Gajić and Paliwal, 2003). The speech recognition process may work well in clean conditions but degrades significantly in speaker and channel mismatch conditions.…”
Section: Introduction (mentioning)
confidence: 96%
“…The ZCPA model has many variable parameters. The scope of this paper covers a limited number of optimisations (using Gajic's recommended values [2] as a starting point). However, many further adaptations are possible which may bring about increased performance: frame length, delta window, filterbank parameters, histogram parameters, number of coefficients etc.…”
Section: Discussion (mentioning)
confidence: 99%
“…Subsequent implementations [2,1] use FIR filterbanks exclusively. A filterbank was designed with 16 Hamming FIR filters of order 61.…”
Section: ZCPA Features (mentioning)
confidence: 99%
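
The filterbank described in the statement above (16 Hamming FIR filters of order 61) could be sketched roughly as follows. This is only an illustration, not the cited authors' design: the statement gives no band edges or sampling rate, so the 8000 Hz sampling rate, the 200 to 3400 Hz range, the equal-width linear band spacing and all function names are assumptions made here. The filters are designed with the standard windowed-sinc method using only the Python standard library.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def hamming_bandpass(order, f_lo, f_hi, fs):
    """Windowed-sinc band-pass FIR filter; returns order + 1 taps."""
    lo, hi = f_lo / fs, f_hi / fs          # edges in cycles per sample
    taps = []
    for n in range(order + 1):
        t = n - order / 2.0                # time index centred on the filter
        # Ideal band-pass = difference of two ideal low-pass responses.
        ideal = 2.0 * hi * sinc(2.0 * hi * t) - 2.0 * lo * sinc(2.0 * lo * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / order)  # Hamming
        taps.append(ideal * window)
    return taps

def make_filterbank(n_filters=16, order=61,
                    f_min=200.0, f_max=3400.0, fs=8000.0):
    """Hypothetical bank: equal-width bands between f_min and f_max (assumed)."""
    width = (f_max - f_min) / n_filters
    bank = []
    for k in range(n_filters):
        lo = f_min + k * width
        bank.append((lo, lo + width, hamming_bandpass(order, lo, lo + width, fs)))
    return bank

def gain(taps, f, fs):
    """Magnitude of the filter's frequency response at f Hz."""
    re = sum(h * math.cos(2.0 * math.pi * f * n / fs) for n, h in enumerate(taps))
    im = sum(h * math.sin(2.0 * math.pi * f * n / fs) for n, h in enumerate(taps))
    return math.hypot(re, im)

bank = make_filterbank()
lo, hi, taps = bank[0]
print(len(bank), len(taps))  # 16 filters, 62 taps each (order 61)
```

With these assumed bands, each 200 Hz passband is narrower than the transition width an order-61 Hamming design can achieve at 8000 Hz, so the filters overlap noticeably; a real implementation would pick band edges (for example on a perceptual frequency scale) to suit the task.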
“…In particular, $C_n(0)$ is the zero-crossing count. Level-crossing analysis has been widely used in engineering, physics, speech recognition and other fields; see [11, 19–21, 30, 35]. For example, if $X_i = \sum_{j=1}^{k} [A_j \cos(i\omega_j) + B_j \sin(i\omega_j)] + e_i$ for independent normal random variables $A_j$, $B_j$ and $e_i$, then one can estimate the frequencies $\omega_j$ by considering the zero-crossing counts of $X_i$ and its differencing sequences; see [21].…”
Section: Introduction (mentioning)
confidence: 99%
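
The link between zero-crossing counts and frequency in the statement above can be illustrated with a deliberately simplified case: a single sinusoid (k = 1), fixed rather than random amplitudes, and no noise term. A sinusoid with angular frequency ω per sample changes sign about ω/π times per sample, so the count over N samples gives the estimate ω̂ = π · count / (N − 1). The function names and the chosen values below are assumptions for illustration; the differencing sequences mentioned in the statement, which are needed for several frequencies or for noisy data, are not shown.

```python
import math

def zero_crossing_count(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0.0)

def estimate_frequency(x):
    """Angular frequency (radians per sample) from the zero-crossing count."""
    return math.pi * zero_crossing_count(x) / (len(x) - 1)

# Hypothetical test signal: one sinusoid, no noise (a simplification).
omega = 0.3          # true angular frequency, radians per sample
a, b = 1.0, 0.5      # fixed amplitudes standing in for A_1 and B_1
x = [a * math.cos(i * omega) + b * math.sin(i * omega) for i in range(1, 10001)]

omega_hat = estimate_frequency(x)
print(omega_hat)     # close to 0.3
```

The count can be off by at most about one crossing at the ends of the record, so the error of this estimate shrinks roughly as π/N; additive noise would add spurious crossings and bias the estimate upward.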