One of the most important issues in speech applications is the preprocessing stage, which is meant to produce a manageable set of significant features, exploiting the capabilities of the classification phase [5]. The most widely used features for speech recognition, which are also applied to other tasks involving speech and music signals, are the mel-frequency cepstral coefficients (MFCCs) [1,5]. These are based on the linear model of voice production and a psychoacoustic scale [5]. Even though MFCCs provide acceptable performance under laboratory conditions, recognition rates degrade significantly in the presence of noise. This has motivated many advances in robust feature extraction, such as perceptual linear prediction (PLP) and relative spectra [1]. More recently, speech processing techniques based on computational intelligence tools have been developed. For example, several approaches based on evolutionary computation have been proposed to search for optimal speech representations [8]. Wavelet-based processing provides useful tools for the analysis of nonstationary signals, which have been found suitable for speech feature extraction [6]. To build a representation based on the wavelet packet transform (WPT), a particular orthogonal basis is frequently selected among all the available bases [6]. However, for speech recognition there is no evidence that an orthogonal basis is advantageous. Therefore, once the orthogonality restriction is removed, the complete WPT decomposition offers a highly redundant set of coefficients, some of which can be selected to build an optimal representation. The optimisation of wavelet decompositions for feature extraction has been studied in many different ways, though it is still an open challenge in speech processing. For example, the optimisation of wavelet decompositions by means of evolutionary algorithms was proposed for image watermarking [4] and for signal denoising [2].
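The redundancy of the complete WPT decomposition mentioned above can be illustrated with a minimal sketch. It assumes the Haar filter pair purely for simplicity (the wavelet family is not specified here), and the names `haar_step` and `full_wpt` are hypothetical helpers, not part of any library. The sketch builds the full packet tree of a short signal and compares the number of coefficients in the whole tree with the number retained by any single orthogonal basis.

```python
import math


def haar_step(x):
    """Split a signal into Haar approximation and detail halves.

    Assumption: the Haar filter is used only for illustration.
    The length of x must be even.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail


def full_wpt(signal, levels):
    """Return the complete wavelet packet tree as a list of levels.

    tree[j] holds the 2**j nodes at depth j; tree[0] is the signal itself.
    The signal length must be divisible by 2**levels.
    """
    tree = [[list(signal)]]
    for _ in range(levels):
        next_level = []
        for node in tree[-1]:
            # Unlike the plain wavelet transform, every node is split again,
            # not only the approximation branch.
            next_level.extend(haar_step(node))
        tree.append(next_level)
    return tree


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
tree = full_wpt(signal, levels=3)

# Coefficients available in the whole tree (excluding the raw signal).
n_total = sum(len(node) for level in tree[1:] for node in level)
# Any orthogonal basis chosen from the tree keeps exactly len(signal) of them.
n_orthogonal = len(signal)

print(n_total, n_orthogonal)  # prints "24 8"
```

With three decomposition levels, the tree offers three times as many coefficients as an orthogonal basis would use, which is the over-complete pool from which a subset can be selected.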
In [9] we proposed a novel approach for the optimisation of over-complete decompositions from a WPT dictionary based on a multi-objective genetic algorithm (MOGA). The MOGA makes it possible to maximise the classification accuracy while minimising the number of features. To obtain features appropriate for state-of-the-art speech recognizers, a classifier based on hidden Markov models (HMM) is used to estimate the capability of candidate solutions, using a set of English phonemes. The proposed method, which we refer to as evolutionary wavelet packets (EWP), exploits the benefits provided