To date, the Universal Law of Gravitation has been applied to many problems in pattern classification. Its popularity stems from its clear theoretical foundations and the competitive effectiveness of the classifiers based on it. Moons and Circles data sets are distinctive types of data sets encountered in machine learning. Although they have not yet been formally defined, their visualizations suggest a working definition: sets in which the objects of individual classes are distributed in shapes resembling circles or semicircles. This article attempts to improve the gravitational classifier that creates a data particle based on the class. The aim was to compare the effectiveness of the developed Geometrical Divide method with the popular method of creating a class-based data particle, described by a compound of 1 ÷ 1 cardinality, in the classification of Moons and Circles data sets. The research used eight artificially generated data sets, some containing classes that were explicitly separated from each other and others in which objects of different classes overlapped. In the experiments, the Geometrical Divide method was combined with several algorithms for determining the mass of a data particle. The research also used k-fold cross-validation. The results clearly showed that the proposed method is an efficient approach to classifying Moons and Circles data sets. The conclusion section of the article discusses the identified advantages and disadvantages of the method as well as possibilities for further research and development.
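As a rough illustration of the class-based data particle that the abstract above uses as its baseline, the following Python sketch collapses each class into a single particle and assigns a query object to the class whose particle attracts it most strongly. The function names (`build_class_particles`, `classify`), the use of the class centroid as the particle position, the object count as the particle mass, and the force rule F = m / d² with unit constants are all assumptions made for illustration. The abstract does not specify them, and this is neither the authors' implementation nor the Geometrical Divide method.

```python
def build_class_particles(points, labels):
    """Collapse each class into one data particle: label -> (centroid, mass)."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    particles = {}
    for label, members in groups.items():
        dim = len(members[0])
        centroid = tuple(
            sum(m[i] for m in members) / len(members) for i in range(dim)
        )
        # Assumption: the mass of a particle is the number of objects it absorbs.
        particles[label] = (centroid, float(len(members)))
    return particles


def classify(particles, x, eps=1e-12):
    """Return the label whose particle exerts the largest force on x.

    Assumption: force = mass / squared Euclidean distance, with the
    gravitational constant and the mass of x both set to 1.
    """
    def force(item):
        centroid, mass = item[1]
        d2 = sum((a - b) ** 2 for a, b in zip(x, centroid))
        return mass / (d2 + eps)  # eps avoids division by zero at the centroid

    return max(particles.items(), key=force)[0]


# Toy example: two well-separated groups of three objects each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]
particles = build_class_particles(points, labels)
near_first = classify(particles, (0.5, 0.5))   # closest to the class-0 particle
near_second = classify(particles, (5.0, 5.5))  # closest to the class-1 particle
```

Under these assumptions each class contributes exactly one centroid, so two classes arranged as concentric circles would yield nearly coincident particles. This is one plausible reason a single class-based particle struggles on such shapes.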
The history of gravitational classification began in 1977. Over the years, gravitational approaches have been extended in many ways and adapted to different classification problems. This article is the next stage of research on algorithms that create data particles by geometrical divide. Previous analyses established that the Geometrical Divide (GD) method outperforms the algorithm that creates data particles based on classes by a compound of 1 ÷ 1 cardinality. This holds for the classification of balanced data sets in which class centroids are close to each other and groups of objects with different labels overlap. The purpose of the article was to examine the efficiency of the Geometrical Divide method in the classification of unbalanced data sets, using a real-world case, occupancy detection, as an example. In addition, the paper develops the concept of the Unequal Geometrical Divide (UGD). The approaches were evaluated on 26 unbalanced data sets: 16 with the features of Moons and Circles data sets and 10 created from a real occupancy data set. The experiment compared the GD method, its unbalanced variant (UGD), and the 1CT1P approach. Each method was combined with three data particle mass determination algorithms: the n-Mass Model (n-MM), the Stochastic Learning Algorithm (SLA), and the Batch-update Algorithm (BLA). k-fold cross-validation, precision, recall, F-measure, and the number of data particles used were applied in the evaluation process. The results showed that the methods based on geometrical divide outperform the 1CT1P approach in the classification of imbalanced data sets. The article's conclusion describes the observations and indicates potential directions for further research and development of methods that create data particles through geometrical divide.
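The evaluation measures named in the abstract above follow standard definitions, and the sketch below shows how they might be computed in Python for one class treated as positive, together with a simple fold splitter for k-fold cross-validation. The helper names (`precision_recall_f1`, `k_fold_indices`), the unshuffled contiguous folds, and the toy labels are assumptions for illustration. The abstract does not state how the folds were built or which class was treated as positive.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-measure (F1) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    # The first n % k folds receive one extra element so all n items are used.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Toy imbalanced example: 3 positive objects, 7 negative objects.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
folds = list(k_fold_indices(len(y_true), 3))
```

In this toy case 8 of 10 predictions are correct, while precision, recall, and F-measure for the minority class are each 2/3. This illustrates why class-wise measures are commonly preferred over plain accuracy on imbalanced data.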