Multi-layer perceptrons and trained classification trees are two very different techniques which have recently become popular. Given enough data and time, both methods are capable of performing arbitrary non-linear classification. These two techniques, which developed out of different research communities, have not been previously compared on real-world problems. We first consider the important differences between multi-layer perceptrons and classification trees and conclude that there is not enough theoretical basis for the clear-cut superiority of one technique over the other. For this reason, we performed a number of empirical tests on quite different problems in power system load forecasting and speaker-independent vowel identification. We compared the performance for classification and prediction in terms of accuracy outside the training set. In all cases, even with various sizes of training sets, the multi-layer perceptron performed as well as or better than the trained classification trees. We are confident that the univariate version of the trained classification trees does not perform as well as the multi-layer perceptron. More studies are needed, however, on the comparative performance of the linear combination version of the classification trees.
II. BACKGROUND
A. Multi-Layer Perceptrons
The name "artificial neural networks" has in some communities become almost synonymous with multi-layer perceptrons (MLP's) trained by back-propagation. Our power studies made use of this standard algorithm, and our vowel studies made use of a conjugate gradient version [10] of back-propagation. In all cases the training data consisted of ordered pairs {(X,Y)} for regression, or {(X,C)} for classification. The input to the network is X and the output is, after training, hopefully very close to Y or C. The network consists of a number of "neuron-like" units which multiply neural inputs by weights, sum the products, and then pass the result through an instantaneous sigmoid nonlinearity. Some of these units connect to elements of X. The distinctive feature of multi-layer perceptrons is that not all units connect
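The unit computation described above (multiply inputs by weights, sum the products, pass the sum through a sigmoid) can be sketched as follows. This is a minimal illustrative forward pass only; the layer sizes and weights are made up for the example and are not the networks used in the experiments.

```python
import math

def sigmoid(x):
    # Instantaneous sigmoid nonlinearity applied to the weighted sum.
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, bias):
    # One "neuron-like" unit: multiply inputs by weights, sum, squash.
    s = sum(w * v for w, v in zip(weights, inputs)) + bias
    return sigmoid(s)

def mlp_forward(x, layers):
    # layers: list of (weight_matrix, bias_vector) pairs; each layer's
    # unit outputs become the next layer's inputs.
    activations = x
    for weight_matrix, biases in layers:
        activations = [unit_output(activations, w_row, b)
                       for w_row, b in zip(weight_matrix, biases)]
    return activations
```

For example, a 2-input network with one hidden layer of two units and a single output unit is `mlp_forward([1.0, 0.5], [(hidden_weights, hidden_biases), (output_weights, output_biases)])`; training with back-propagation then adjusts the weights so the output approaches Y (or C).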
We describe EAR, an English Alphabet Recognizer that performs speaker-independent recognition of letters spoken in isolation. During recognition, (a) signal processing routines transform the digitized speech into useful representations, (b) rules are applied to the representations to locate segment boundaries, (c) feature measurements are computed on the speech segments, and (d) a neural network uses the feature measurements to classify the letter. The system was trained on one token of each letter from 120 speakers. Performance was 95% when tested on a new set of 30 speakers. Performance was 96% when tested on a second token of each letter from the original 120 speakers (multi-speaker recognition). EAR is the first fully automatic, neural-network based, speaker-independent spoken letter recognition system. The recognition accuracy is 6% higher than previously reported systems (half the error rate). We attribute the high level of performance to accurate and explicit phonetic segmentation, the use of speech knowledge to select features that measure the important linguistic information, and the ability of the neural classifier to model the variability of the data.
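The four-stage structure (a)-(d) above can be sketched as a pipeline. Every stage below is a toy stand-in invented for illustration (fixed-window framing, an energy-threshold segmentation rule, mean-energy features, and a threshold classifier in place of the neural network); none of it is EAR's actual signal processing, rules, or classifier.

```python
def signal_processing(samples):
    # (a) Transform digitized speech into a representation
    # (toy: split samples into fixed-length frames of 4).
    frame = 4
    return [samples[i:i + frame] for i in range(0, len(samples), frame)]

def locate_boundaries(frames):
    # (b) Rule-based segmentation (toy rule: a segment starts at any
    # frame whose energy exceeds a fixed threshold).
    energies = [sum(s * s for s in f) for f in frames]
    return [i for i, e in enumerate(energies) if e > 1.0]

def measure_features(frames, boundaries):
    # (c) Feature measurements on the located segments
    # (toy feature: mean per-sample energy of each segment frame).
    return [sum(s * s for s in frames[b]) / len(frames[b]) for b in boundaries]

def classify(features):
    # (d) Classifier stand-in for the neural network
    # (toy rule: pick between two letters by peak feature value).
    return "B" if features and max(features) > 2.0 else "E"

def recognize(samples):
    # Run the four stages in order, as in the description above.
    frames = signal_processing(samples)
    boundaries = locate_boundaries(frames)
    features = measure_features(frames, boundaries)
    return classify(features)
```

The point of the sketch is the data flow: each stage consumes the previous stage's output, so segmentation errors in (b) propagate into the features in (c), which is why the paper credits accurate, explicit phonetic segmentation for much of the system's performance.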