Mixed data, containing both numerical and categorical variables, appear in many areas such as credit scoring, medical diagnosis, and product manufacturing. Most classification techniques are suited to only one of the variable types in mixed data. For example, techniques based on Euclidean distance are suited to numerical variables, while techniques based on symbolic logic are suited to categorical variables. In this paper, we propose a hybrid classification method to improve classification performance. The main idea is to handle the categorical and numerical attributes separately, each with an appropriate technique. First, the whole data set is partitioned into several subsets by applying a decision tree to the categorical variables only. Next, the posterior probability is obtained by applying either k-NN or SVM to the numerical variables in each leaf node of the decision tree. Six data sets from the UCI Machine Learning Repository (Australian credit, German credit, Japan credit, Mammographic mass, churn, bank) are used to evaluate the performance of the proposed hybrid classifier. The hybrid k-NN classifier performs better than plain k-NN, and the hybrid SVM performs slightly better than plain SVM.
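The two-stage idea described in the abstract (a decision tree on the categorical variables, then k-NN on the numerical variables within each leaf) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the Gini split criterion, the multiway split on category values, the fallback for unseen categories, the function names, and the toy data are all assumptions made here for illustration. The SVM variant is not shown.

```python
import math
from collections import Counter, defaultdict

# Each row is (categorical_tuple, numerical_tuple, label). Hypothetical layout.

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, cat_idx, min_leaf=5, depth=0, max_depth=3):
    """Partition rows using categorical attributes only (assumed Gini criterion)."""
    labels = [r[2] for r in rows]
    best = None
    if depth < max_depth and len(set(labels)) > 1:
        for j in cat_idx:
            groups = defaultdict(list)
            for r in rows:
                groups[r[0][j]].append(r)
            # Skip attributes that do not split or that create tiny leaves.
            if len(groups) < 2 or min(len(g) for g in groups.values()) < min_leaf:
                continue
            score = sum(len(g) / len(rows) * gini([r[2] for r in g])
                        for g in groups.values())
            if best is None or score < best[0]:
                best = (score, j, groups)
    if best is None:
        return {"leaf": rows}
    _, j, groups = best
    rest = [i for i in cat_idx if i != j]
    children = {v: build_tree(g, rest, min_leaf, depth + 1, max_depth)
                for v, g in groups.items()}
    return {"attr": j, "default": rows, "children": children}

def find_leaf(tree, cats):
    """Return the training rows stored in the leaf that `cats` falls into."""
    while "leaf" not in tree:
        child = tree["children"].get(cats[tree["attr"]])
        if child is None:
            # Unseen category value: fall back to all rows at this node (assumption).
            return tree["default"]
        tree = child
    return tree["leaf"]

def predict_proba(tree, cats, nums, k=3):
    """k-NN posterior estimate from the numerical variables within the leaf."""
    leaf_rows = find_leaf(tree, cats)
    nearest = sorted(leaf_rows, key=lambda r: math.dist(r[1], nums))[:k]
    votes = Counter(r[2] for r in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}

# Toy data: the numerical variable predicts the class in opposite directions
# depending on the categorical variable, so partitioning first helps k-NN.
train = []
for x in range(10):
    train.append((("own",), (float(x),), int(x >= 5)))
    train.append((("rent",), (float(x),), int(x < 5)))

tree = build_tree(train, cat_idx=[0], min_leaf=3)
p_own = predict_proba(tree, ("own",), (8.0,))
p_rent = predict_proba(tree, ("rent",), (8.0,))
```

In this toy setting, the same numerical value 8.0 receives opposite class estimates in the two leaves, which is the situation in which a k-NN classifier applied to the whole data set would be expected to struggle.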