2016
DOI: 10.18201/ijisae.267490

Comparison of the effect of unsupervised and supervised discretization methods on classification process

Abstract: Most of the machine learning and data mining algorithms use discrete data for the classification process. But, most data in practice include continuous features. Therefore, a discretization pre-processing step is applied on these datasets before the classification. Discretization process converts continuous values to discrete values. In the literature, there are many methods used for discretization process. These methods are grouped as supervised and unsupervised methods according to whether a class informatio…
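As a rough illustration of what "converting continuous values to discrete values" can look like, the sketch below applies equal-width binning, a common unsupervised technique that uses no class labels. It is not taken from the paper: the function names, the `ages` values, and the choice of four bins are all assumptions made for the example.

```python
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points of an equal-width split."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(values, edges):
    """Map each continuous value to the index of the interval it falls in.

    A value that sits exactly on a cut point is placed in the upper interval.
    """
    return [bisect_right(edges, v) for v in values]


# Made-up continuous feature (hypothetical data, not from the paper).
ages = [18, 22, 25, 31, 40, 58]

edges = equal_width_edges(ages, 4)   # cut points at 28.0, 38.0, 48.0
codes = discretize(ages, edges)      # interval index per value
print(edges, codes)
```

A supervised method would choose the cut points using the class labels instead of the value range alone; that step is not shown here.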

Cited by 10 publications (5 citation statements)
References 5 publications
“…The selection and elimination stages allowed a cleaner data set, free of inconsistencies and empty records. In the transformation stage, a discretization method [24] was applied to convert the continuous variables to ordinals. Socioeconomic data (age, income, disability) and academic data (percentages absent and grades) were discretized in a personalized way using interval ranges established by the institution; for example, with a percentage of non-attendance greater than 30%, the student loses the course [5].…”
Section: Data Processing
Citation type: mentioning
confidence: 99%
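The rule-based discretization described in this statement could be sketched as follows. Only the 30% non-attendance threshold comes from the quoted text; the 10% cut point, the label names, and the function name are hypothetical placeholders, since the institution's actual ranges are not given.

```python
from bisect import bisect_left

# Cut points in percent. Only 30 is stated in the quoted text;
# 10 is a hypothetical extra threshold added for illustration.
ABSENCE_EDGES = [10, 30]

# Hypothetical ordinal labels, one per interval.
ABSENCE_LABELS = ["low", "moderate", "fails_course"]


def discretize_absence(pct):
    """Map a non-attendance percentage to an ordinal label.

    bisect_left keeps a value equal to a cut point in the lower interval,
    so only values strictly greater than 30 reach "fails_course".
    """
    return ABSENCE_LABELS[bisect_left(ABSENCE_EDGES, pct)]


for pct in (5, 30, 30.1):
    print(pct, discretize_absence(pct))
```

Fixed, domain-defined ranges like these make the resulting categories easy to interpret, at the cost of ignoring how the values are actually distributed in the data.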
“…To select the candidate discretizers for this study, related literature comparing various discretizers has shown that supervised discretization methods usually perform better than unsupervised ones [5, 29]. Moreover, recent comparative studies focusing on the data discretization task employed the MDLP and ChiMerge discretizers [24, 30].…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
“…When an attribute is continuous, it makes the model building difficult. Hence, the preprocessing step is primordial before building classification patterns in order to maximize the predictive accuracy [33]. In particular, a discretization method is employed to tackle this limitation.…”
Section: A Network Patterns Analysis
Citation type: mentioning
confidence: 99%