This paper presents a neuro-fuzzy system for speech classification. We propose a multi-resolution feature extraction technique that handles adaptive frame sizes. Each frame is clustered with fuzzy adaptive resonance theory (FART), an extension of ART that clusters its inputs through unsupervised learning. ART is a family of self-organizing neural networks capable of clustering arbitrary sequences of input patterns into stable recognition codes. In our experiments, features are extracted from each phoneme in the TIMIT database. The system achieves a speech classification performance of 88.66%, an encouraging result that demonstrates the effectiveness of the proposed approach.
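
To make the clustering step concrete, the following is a minimal sketch of the standard fuzzy ART procedure (complement coding, category choice, vigilance test, weight update). It is not the authors' implementation: the class name `FuzzyART`, the parameter values (`rho`, `alpha`, `beta`), and the toy two-dimensional "frames" are assumptions chosen for illustration, and the features are assumed to be already scaled to [0, 1].

```python
# Minimal fuzzy ART sketch (illustrative only; all names and
# parameter values here are assumptions, not taken from the paper).

def complement_code(a):
    """Map a feature vector in [0, 1]^d to [a, 1 - a] (length 2d)."""
    return list(a) + [1.0 - x for x in a]


def fuzzy_and(x, w):
    """Component-wise minimum, the fuzzy AND operator."""
    return [min(xi, wi) for xi, wi in zip(x, w)]


class FuzzyART:
    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho = rho      # vigilance: higher values give finer clusters
        self.alpha = alpha  # choice parameter (small positive constant)
        self.beta = beta    # learning rate (1.0 means fast learning)
        self.weights = []   # one weight vector per committed category

    def train_one(self, a):
        """Present one input, update the weights, return its category."""
        i_vec = complement_code(a)
        norm_i = sum(i_vec)  # equals d under complement coding

        # Choice function: T_j = |I ^ w_j| / (alpha + |w_j|)
        scores = []
        for j, w in enumerate(self.weights):
            match = sum(fuzzy_and(i_vec, w))
            scores.append((match / (self.alpha + sum(w)), j, match))
        scores.sort(key=lambda s: (-s[0], s[1]))  # best choice first

        # Search categories in choice order until one passes vigilance.
        for _, j, match in scores:
            if match / norm_i >= self.rho:
                w = self.weights[j]
                m = fuzzy_and(i_vec, w)
                # Learning rule: w_j <- beta * (I ^ w_j) + (1 - beta) * w_j
                self.weights[j] = [
                    self.beta * mi + (1.0 - self.beta) * wi
                    for mi, wi in zip(m, w)
                ]
                return j

        # No category resonates: commit a new one initialized to the input.
        self.weights.append(i_vec)
        return len(self.weights) - 1


# Toy usage: four hypothetical 2-D frame features forming two groups.
frames = [[0.10, 0.20], [0.12, 0.18], [0.90, 0.80], [0.88, 0.82]]
net = FuzzyART(rho=0.75)
labels = [net.train_one(f) for f in frames]
print(labels)
```

With these toy inputs the first two frames fall into one category and the last two into another, so `labels` is `[0, 0, 1, 1]`. Raising the vigilance `rho` makes the match test stricter and tends to produce more, smaller clusters.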