Admissible wavelet packet sub‐band‐based harmonic energy features for Hindi phoneme recognition

Biswas, Amit; Sahu, Prasanna Kumar; Bhowmick, Anirban; Chandra, Mahesh

doi:10.1049/iet-spr.2014.0282

Cited by 14 publications

(12 citation statements)

References 23 publications

(44 reference statements)


“…The researchers showed that the proposed features outperform the MFCC features. Sahu et al (2014) and Biswas et al (2015), proposed a new set of acoustic features that also depend on wavelet packets called Wavelet packet based ERB (Equivalent Rectangular Bandwidth) Cepstral (WERBC). The main idea was to develop wavelet packet tree decomposition similar to the 24 sub-bands of the ERB filters (Sahu et al, 2014).…”

Section: Corresponding Author: Ihsan Al-hassani Department Of Teleco… (mentioning)

confidence: 99%

A New Robust Resonance Based Wavelet Decomposition Cepstral Features for Phoneme Recognition

Al-Hassani, Al-Dakkak, Assami

2019

Res. J. Applied Sci.

Robust Automatic Speech Recognition (ASR) is a challenging task that has been an active research subject for the last 20 years, yet results remain very modest in highly noisy environments. In this study, we propose a new speech parameterization method based on concatenating two wavelet packet decompositions, one using a low Q-factor wavelet and the other a high Q-factor wavelet, to extract speech features suited to ASR in noisy conditions. Experiments on the TIMIT dataset for phoneme recognition show that the proposed wavelet-based features outperform MFCC in all noisy conditions.
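As a rough illustration of the kind of wavelet packet sub-band energy feature discussed in these records, the sketch below builds a full wavelet packet tree and takes the log energy of each leaf sub-band. It is a minimal sketch under stated assumptions: the Haar filter, the tree depth of 3, the 32-sample test frame, and every function name are hypothetical choices made for brevity. It is not the admissible tree of the cited paper, the 24-band ERB-like tree of WERBC, or the low/high Q-factor decomposition described in the abstract above.

```python
import math

def haar_split(x):
    """One Haar analysis step: return (low, high), each half the length of x."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet_leaves(frame, depth):
    """Full wavelet packet tree: split every node, not only the low band."""
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    # 2**depth sub-bands in natural (tree) order, not sorted by frequency
    return nodes

def log_subband_energies(frame, depth, floor=1e-12):
    """Log energy of each leaf sub-band; floor avoids log(0) on silent bands."""
    leaves = wavelet_packet_leaves(frame, depth)
    return [math.log(sum(c * c for c in band) + floor) for band in leaves]

# Hypothetical 32-sample frame: two cycles of a sinusoid (assumption for the demo)
frame = [math.sin(2 * math.pi * 2 * n / 32) for n in range(32)]
features = log_subband_energies(frame, depth=3)
print(len(features))  # one feature per leaf sub-band
```

Because the Haar filters are orthonormal, the sub-band energies sum to the frame energy when the frame length is divisible by 2**depth. A practical front end would typically use a longer filter, prune the tree to match a perceptual scale, and apply a decorrelating transform to the log energies.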


“…al. [10] performed speech recognition using harmonic energy based features. They proposed a novice approach of new wavelet packet sub-band-based energy features.…”

Section: Related Work (mentioning)

confidence: 99%

Spectral Features Analysis for Hindi Speech Recognition System

Garg

2016

IJRET

Automatic speech recognition refers to recognizing speech utterances and converting them to text by machine. For this purpose, the features form an extremely important part: the richness of the features determines the performance of the overall system. This paper therefore examines various speech features that can be used for Hindi speech and that have been tested for many other languages. In this work, MFCC, PLP, EFCC and LPC have been tested against a Hindi speech corpus using the HMM toolkit HTK 3.4.1. These features have been evaluated in a common environment. The main objective of this paper is to summarize and compare traditional and newer feature extraction methodologies in automatic speech recognition systems. This work favours EFCC features over the other features; EFCC showed a significant improvement in noisy environments.


“…For example, an entropy-based method for best wavelet packet basis was proposed for electroencephalogram classification [16]. The use of wavelet-based decompositions has also been applied to the development of features for speech and emotion recognition [17,18]. Other interesting proposals involve the use of evolutionary computing for the optimisation of over-complete decompositions for signal approximation [19], for the design of finite impulse response filters [20] and for the extraction frequency-domain features [21].…”

Section: Introduction (mentioning)

confidence: 99%

Multi‐objective optimisation of wavelet features for phoneme recognition

Vignolo, Rufiner, Milone

2016

IET Signal Process.

One of the most important issues in speech applications involves the preprocessing stage, which is meant to produce a manageable set of significant features, exploiting the capabilities of the classification phase [5]. The most widely used features for speech recognition, also applied to different tasks involving speech and music signals, are the mel-frequency cepstral coefficients (MFCCs) [1,5]. These are based on the linear model of voice production and a psychoacoustic scale [5]. Even though MFCCs provide acceptable performance under laboratory conditions, recognition rates degrade significantly in the presence of noise. This has motivated many advances in the development of robust feature extraction approaches, like perceptual linear prediction (PLP) and relative spectra [1]. More recently, speech processing techniques based on computational intelligence tools have been developed. For example, several approaches based on evolutionary computation have been proposed for the search of optimal speech representations [8]. Wavelet-based processing provides useful tools for the analysis of nonstationary signals, which have been found suitable for speech feature extraction [6]. In order to build a representation based on the wavelet packet transform (WPT), a particular orthogonal basis is frequently selected among all the available bases [6]. However, for speech recognition there is no evidence showing the convenience of using an orthogonal basis. Therefore, once the orthogonality restriction is removed, the complete WPT decomposition offers a highly redundant set of coefficients, some of which can be selected to build an optimal representation. The optimisation of wavelet decompositions for feature extraction has been studied in many different ways, though it is still an open challenge in speech processing. For example, the optimisation of wavelet decompositions by means of evolutionary algorithms was proposed for image watermarking [4] and for signal denoising [2].
In [9] we proposed a novel approach for the optimisation of over-complete decompositions from a WPT dictionary based on a multi-objective genetic algorithm (MOGA). The MOGA allows the classification accuracy to be maximised while minimising the number of features. For the purpose of obtaining appropriate features for state-of-the-art speech recognizers, a classifier based on hidden Markov models (HMM) is used to estimate the capability of candidate solutions, using a set of English phonemes. The proposed method, which we refer to as evolutionary wavelet packets (EWP), exploits the benefits provided
