2015
DOI: 10.5120/ijca2015906891

Data Classification based on Decision Tree, Rule Generation, Bayes and Statistical Methods: An Empirical Comparison

Abstract: In this paper, twenty well-known data mining classification methods are applied to ten UCI machine learning medical datasets, and the performance of the classification methods is empirically compared while varying the number of categorical and numeric attributes, the types of attributes and the number of instances in the datasets. In the performance study, Classification Accuracy (CA), Root Mean Square Error (RMSE) and Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) are used as the metric…

Cited by 4 publications (4 citation statements)
References 25 publications
“…The value of P ( 1 … ) is constant for each experiment so that the maximum value of a class is determined by the maximum value between P (A) P ( 1 … | A) [15]. The equation function formed becomes a maximum multiplication for the prior value and the likelihood function, the function is shown in Equation 3.…”
Section: Naïve Bayes Methods
Citation type: mentioning
confidence: 99%
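
The decision rule described in this statement (pick the class whose prior multiplied by its likelihood is largest, since the evidence term is the same for every class) can be sketched as follows. This is a minimal illustration, not the cited paper's Equation 3: the function name `map_class`, the class labels and all probability values are hypothetical, and the elided arguments in the quote are assumed here to be per-feature conditional probabilities multiplied together under the usual Naïve Bayes independence assumption.

```python
from math import prod


def map_class(priors, likelihoods):
    """Return the class with the largest prior * likelihood product.

    priors:      dict mapping class label -> P(class)
    likelihoods: dict mapping class label -> list of P(feature_i | class)

    The evidence term is omitted because it is identical for every class
    and therefore does not change which class attains the maximum.
    """
    scores = {c: priors[c] * prod(likelihoods[c]) for c in priors}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical two-class example with two observed features.
priors = {"A": 0.6, "B": 0.4}
likelihoods = {"A": [0.2, 0.5], "B": [0.7, 0.5]}

best, scores = map_class(priors, likelihoods)
print(best, scores)
```

In this made-up example class B wins even though class A has the larger prior, because the likelihood of the observed features under B is high enough to outweigh it.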
“…From the training data set that is applied to the model, the confusion matrix produces several values, namely Accuracy, Precision, Recall, and F-Measure. Accuracy is the percentage of cases that have predictions and real values are both positive (TP) or both negative (TN) compared to the total number of cases [15]. Precision or confidence is the ratio between cases that have predictions and real values are both positive (TP) compared to the overall positive predicted cases (TP + FP).…”
Section: Confusion Matrix Testing
Citation type: mentioning
confidence: 99%
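
The four values named in this statement can be computed directly from the confusion-matrix counts. The sketch below follows the quoted definitions of Accuracy and Precision; the quote is cut off before Recall and F-Measure are defined, so those use the standard textbook formulas, which is an assumption about what the citing paper intended. The function name `confusion_metrics` and the counts are hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-measure from binary counts.

    tp, fp, fn, tn: true positives, false positives,
                    false negatives, true negatives.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts for 100 test cases.
metrics = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
print(metrics)
```

F-Measure is taken here as the harmonic mean of Precision and Recall, so it is high only when both are high.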
“…For Regression Classification numerical attribute datasets, NBTree and multiclass Classifier are the best methods. For the categorical attributes of the NB-Tree dataset, Classification via Regression and Bayes Net methods is best [9]. Of these above five rules classification method based on PART and Decision Tree method is the best.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In the paper [7], twenty well known data mining classification methods are applied on ten UCI machine learning medical datasets, and the performance of various classification methods are empirically compared while varying the number categorical and numeric attributes. The types of attributes and the number of instances in datasets.…”
Section: Related Work
Citation type: mentioning
confidence: 99%