The Mahalanobis-Taguchi System (MTS) is a pattern recognition method that constructs a continuous measurement scale, and it performs well on classification and feature selection for real-valued data. However, with recent advances in database technologies, recording data as symbolic intervals has become common practice. Kernel methods are powerful nonlinear statistical learning methods, and they can be defined over objects as diverse as graphs, sets, strings, and text documents. In this paper, we derive a kernel Mahalanobis distance (KMD) to extend MTS to symbolic interval data. To evaluate the proposed method, we perform four experiments on synthetic symbolic interval data sets and seven experiments on real symbolic interval data sets, and we compare our method with MTS based on the interval Mahalanobis distance (IMD). The experimental results show that our method achieves better classification performance than MTS based on IMD in terms of accuracy, specificity, sensitivity, and G-means. However, MTS based on IMD achieves a higher dimension reduction rate than our method.
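The four evaluation measures named above can be illustrated with a small sketch. It assumes the conventional definitions for a binary classifier, with G-means taken as the geometric mean of sensitivity and specificity; the abstract does not spell out its formulas, and the function name and confusion-matrix counts below are hypothetical.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity and G-means.

    Assumes the conventional definitions; the counts are the entries
    of a binary confusion matrix (positive = abnormal class, say).
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    g_means = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_means": g_means,
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = binary_metrics(tp=40, fn=10, tn=45, fp=5)
print(metrics)
```

Because G-means multiplies the two class-wise rates, it drops sharply when either class is classified poorly, which is why it is often reported alongside accuracy for imbalanced data.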
The Mahalanobis-Taguchi System (MTS) is a relatively new classification and diagnostic technique in pattern recognition that combines the Mahalanobis distance, orthogonal arrays, and signal-to-noise ratios. Its advantages, such as the ability to select effective variables, the absence of assumptions about the data distribution, and high classification speed, have led to a wide range of applications in areas including industrial production, business management, and pattern recognition. However, its use of orthogonal arrays and signal-to-noise ratios to construct the reference space remains controversial. To address this shortcoming, an optimization model of the reference space is constructed using a hybrid-encoding genetic algorithm. The decision model aims to minimize the classification error rate and maximize the signal-to-noise ratio; an initial solution is generated with a uniform distribution strategy, and the genetic algorithm is then applied to obtain further improvements. The experimental results show that the model converges well and is effective.
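The Mahalanobis distance against a reference space, the first ingredient of MTS, can be sketched as follows. This is a minimal illustration assuming the commonly used scaled form: observations are standardized with the mean and standard deviation of the reference (normal) group, and the quadratic form with the inverse correlation matrix is divided by the number of variables. The function names and synthetic data are hypothetical, and the orthogonal-array, signal-to-noise, and genetic-algorithm steps are not shown.

```python
import numpy as np


def fit_reference_space(normal):
    """Estimate mean, std and inverse correlation matrix of the normal group."""
    normal = np.asarray(normal, dtype=float)
    mean = normal.mean(axis=0)
    std = normal.std(axis=0)  # population std (ddof=0), an assumption
    corr = np.corrcoef(normal, rowvar=False)
    return mean, std, np.linalg.inv(corr)


def scaled_md(x, mean, std, corr_inv):
    """Scaled Mahalanobis distance of each row of x to the reference space."""
    z = (np.atleast_2d(x) - mean) / std
    k = z.shape[1]  # number of variables
    # Row-wise quadratic form z_i^T R^{-1} z_i, divided by k.
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 3))
mean, std, corr_inv = fit_reference_space(normal)

md_normal = scaled_md(normal, mean, std, corr_inv)
md_abnormal = scaled_md(normal[:5] + 4.0, mean, std, corr_inv)
print(md_normal.mean(), md_abnormal.mean())
```

With this scaling the reference group averages a distance of one, so observations far from the reference space stand out as values well above one.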