In classification, the central task of most algorithms is to find an accurate decision boundary. However, many decision boundaries are too complex to be discovered directly. In this paper, we propose an Incremental Maximum Gaussian Mixture Partition (IMGMP) algorithm for classification, aimed at problems with complex decision boundaries. As a self-adaptive algorithm, it uses a divide-and-conquer strategy to construct a reasonable decision boundary step by step. An improved K-means clustering and a Maximum Gaussian Mixture model are used in the classifier. The algorithm has been tested on artificial and real-life datasets to evaluate its flexibility and robustness.
Introduction

In the field of machine learning, Artificial Neural Networks, Support Vector Machines (SVM) [1], and many other algorithms are used to solve classification problems [2]. A common idea behind these algorithms is to find a decision boundary between two classes [3]. In practice, however, discovering the decision boundary is always a tricky problem [4,5]. First, most datasets are not linearly separable and their decision boundaries are complex [6,7]. If a dataset has more than two dimensions, its decision boundary is in fact a decision surface and becomes even more complex [6,8]. In addition, the boundaries of different classes may intersect in some cases. Although many algorithms can discover complex decision boundaries, they still cannot satisfy all requirements. For instance, SVM [1,9] can tune its kernel function to approximate the real decision boundary, but a small set of kernel functions cannot fit a wide variety of tasks [10].

To represent the decision boundary more flexibly, researchers combine a set of simple boundaries to simulate the actual boundary. For instance, Jinghao and Binge use a set of hyper-planes [11] or hyper-spheres [12,13]: GA or PSO is used to dynamically search the whole data space to find the clusters of each class, and a group of hyper-spheres then represents the clusters as classifiers. This approach has also been described as an expert system: each cluster is assigned to an expert that knows the boundary of the corresponding cluster, and the set of experts together implies the complete decision surface. An expert indicates a range of a class's patterns, and any pattern falling within this range is regarded as belonging to that class.

The above methods work well in practice, but they still have some limitations. First, the shapes of clusters can be complex, so hyper-planes and hyper-spheres alone are not sufficient to represent them. Second, the boundaries of the experts are absolute; once the boundaries of two classes overlap, problems arise [15]. Finally, there are gap areas outside the experts, and it is difficult to decide how to classify patterns that fall into these gaps.

To solve the problems above, the IMGMP algorithm in this paper uses the idea of fuzzy learning [16,17] in the expert system. IMGMP will employ Gaussian models [18] as classifiers in t...
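To make the expert-system idea concrete, the following is a minimal sketch (in Python) of per-class Gaussian "experts" with soft membership: one Gaussian mixture is fit to each class, and a pattern is assigned to the class whose expert gives it the highest likelihood, so overlap and gap regions receive graded scores rather than an absolute in/out decision. This is only an illustration of the general technique under stated assumptions, not the authors' IMGMP; the toy dataset, the number of mixture components, and the use of scikit-learn's GaussianMixture are assumptions introduced here.

# Sketch: per-class Gaussian mixture "experts" with soft (likelihood-based) membership.
# Illustrative only; not the IMGMP algorithm described in this paper.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy two-class data with non-linear structure: class 0 is two separated blobs,
# class 1 is a single elongated blob between them (assumed data, for illustration).
X0 = np.vstack([rng.normal([-3, 0], 0.5, (100, 2)),
                rng.normal([3, 0], 0.5, (100, 2))])
X1 = rng.normal([0, 0], [0.5, 2.0], (200, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

def fit_class_experts(X, y, n_components=2):
    # Fit one Gaussian mixture ("expert") per class.
    experts = {}
    for label in np.unique(y):
        gm = GaussianMixture(n_components=n_components,
                             covariance_type='full',
                             random_state=0)
        gm.fit(X[y == label])
        experts[label] = gm
    return experts

def predict(experts, X_new):
    # Assign each pattern to the class whose expert gives the highest log-likelihood.
    labels = sorted(experts)
    log_lik = np.column_stack([experts[c].score_samples(X_new) for c in labels])
    return np.array(labels)[np.argmax(log_lik, axis=1)]

experts = fit_class_experts(X, y)
print("training accuracy:", np.mean(predict(experts, X) == y))

Because membership is a likelihood rather than a hard geometric boundary, a pattern lying between two experts is simply assigned to the more plausible one, which is the behavior the fuzzy-learning formulation above is meant to provide.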