Abstract. The gold standard for a classifier is the condition of optimality attained by the Bayesian classifier. Within a Bayesian paradigm, if we are allowed to compare the testing sample with only a single point in the feature space from each class, the optimal Bayesian strategy is to classify based on the (Mahalanobis) distance from the corresponding means. The reader should observe that, in this context, the mean is, in one sense, the most central point of the respective distribution. In this paper, we show that optimal results can also be obtained by operating in a diametrically opposite way, i.e., in a so-called "anti-Bayesian" manner. Indeed, we show the counter-intuitive result that by working with very few points distant from the mean (sometimes as few as two), one can obtain remarkable classification accuracies. Further, if these points are determined by the Order Statistics of the distributions, the accuracy of our method, referred to as Classification by Moments of Order Statistics (CMOS), attains the optimal Bayes' bound. This claim has been proven for many uni-dimensional and some multi-dimensional distributions within the exponential family, and the theoretical results have been verified by rigorous experimental testing. Apart from being fascinating and pioneering in their own right, these results also give a theoretical foundation for the families of Border Identification (BI) algorithms reported in the literature.
This paper presents a comprehensive report on the use of Order Statistics (OS) for parametric Pattern Recognition (PR) for various distributions within the exponential family. Although the field of parametric PR has been thoroughly studied for over five decades, the use of the OS of the distributions to achieve this has not been reported. The pioneering work on using OS for classification was presented earlier for the Uniform distribution and for some members of the exponential family, where it was shown that optimal PR can be achieved in a counter-intuitive manner, diametrically opposed to the Bayesian paradigm, i.e., by comparing the testing sample to a few samples distant from the mean. Apart from the results for the Gaussian and double-exponential distributions, which are merely cited here, our new results include the Rayleigh, Gamma and certain Beta distributions. The new scheme, referred to as Classification by Moments of Order Statistics (CMOS), has an accuracy that attains the Bayes' bound for symmetric distributions and is otherwise very close to the optimal Bayes' bound, as has been shown both theoretically and by rigorous experimental testing. The results here also give a theoretical foundation for the families of Border Identification (BI) algorithms reported in the literature.
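The symmetric case can be illustrated with a small numerical sketch. The abstract does not spell out the procedure, so everything below is an assumption made for illustration: two univariate Gaussian classes with equal variance and equal priors, means ordered as mu1 < mu2, and one comparison point per class taken as the expected larger (class 1) or smaller (class 2) of two samples, which for a Gaussian lies at mu ± sigma/sqrt(pi). The function names are hypothetical. Under these assumptions, and provided the two points do not cross, the nearest-point rule has the same decision boundary as the nearest-mean (Bayes) rule.

```python
import math
import numpy as np


def os2_points(mu1, mu2, sigma):
    """Comparison points from the order statistics of a sample of size two.

    Assumes mu1 < mu2 and a common standard deviation sigma.
    For a Gaussian, E[max(X1, X2)] = mu + sigma / sqrt(pi) and
    E[min(X1, X2)] = mu - sigma / sqrt(pi).
    """
    d = sigma / math.sqrt(math.pi)
    return mu1 + d, mu2 - d  # class 1 point moves right, class 2 point moves left


def classify_cmos(x, mu1, mu2, sigma):
    """Assign each x to the class whose order-statistic point is nearer."""
    a, b = os2_points(mu1, mu2, sigma)
    if a >= b:
        # Illustrative restriction: this sketch only covers well-separated classes.
        raise ValueError("classes too close: comparison points cross")
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x - a) <= np.abs(x - b), 1, 2)


def classify_by_means(x, mu1, mu2):
    """Nearest-mean rule; Bayes-optimal for equal variances and equal priors."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x - mu1) <= np.abs(x - mu2), 1, 2)


# Compare the two rules on a mixed sample drawn from both classes.
rng = np.random.default_rng(0)
mu1, mu2, sigma = 0.0, 3.0, 1.0
x = np.concatenate([rng.normal(mu1, sigma, 5000), rng.normal(mu2, sigma, 5000)])
agree = np.mean(classify_cmos(x, mu1, mu2, sigma) == classify_by_means(x, mu1, mu2))
print(f"fraction of samples where the two rules agree: {agree:.4f}")
```

Both rules place the boundary at the midpoint (mu1 + mu2) / 2, because the two comparison points are shifted by the same amount in opposite directions. This only illustrates the symmetric, equal-variance case; the asymmetric distributions mentioned above (Rayleigh, Gamma, Beta) are not covered by this sketch.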