2015
DOI: 10.1515/bpasts-2015-0105
Discretization of data using Boolean transformations and information theory based evaluation criteria

Abstract: Discretization is one of the most important parts of decision table preprocessing. Transforming continuous attribute values into discrete intervals influences further analysis using data mining methods. In particular, the accuracy of generated predictions depends heavily on the quality of discretization. The paper describes three new heuristic algorithms for discretization of numeric data, based on Boolean reasoning. Additionally, an entropy-based evaluation of discretization …
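The paper's Boolean-reasoning algorithms cannot be reconstructed from the truncated abstract, but the entropy-based evaluation it mentions is a standard criterion. The following is a minimal, illustrative Python sketch of scoring candidate cut-points by weighted class entropy (lower is better); the function names and toy data are assumptions for illustration, not taken from the paper.

```python
# Minimal sketch of an entropy criterion for supervised discretization:
# score each candidate cut-point by the class entropy of the split it induces.
# Not the paper's algorithm; names and data are illustrative.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def cut_point_entropy(values, labels, cut):
    """Weighted class entropy after a two-way split at `cut` (lower is better)."""
    left = [lab for v, lab in zip(values, labels) if v <= cut]
    right = [lab for v, lab in zip(values, labels) if v > cut]
    n = len(labels)
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)

# Example: pick the best cut for one numeric attribute (values assumed sorted).
values = [1.2, 1.5, 2.1, 2.8, 3.0, 3.9]
labels = ["a", "a", "a", "b", "b", "b"]
cuts = [(v1 + v2) / 2 for v1, v2 in zip(values, values[1:])]
best = min(cuts, key=lambda c: cut_point_entropy(values, labels, c))
print(best)  # 2.45 -> this cut separates the two classes perfectly (entropy 0)
```

A full supervised discretizer would apply such a criterion recursively or greedily over all attributes; this sketch only shows the evaluation step the abstract alludes to.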

Cited by 5 publications (4 citation statements) | References 32 publications
“…Some heuristic measures, e.g. entropy [34], can be used to determine the best cut-points. In the case of unsupervised methods, information about class labels is omitted during discretisation.…”
Section: Objectives and Algorithms of Discretisation
confidence: 99%
“…Assuming that a spike train is being recorded within some time interval, so that in each time slice, a spike is either present or absent, it is natural and justified to represent the spike train as a sequence of bits [32]. Discretisation is one of the most important points of neural communication [33]. Such discretisation allows us to treat both a neuron's stimuli and its response strictly as binary stochastic processes.…”
Section: Theory and Models
confidence: 99%
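The binarisation described in the statement above is concrete enough to sketch: spike times are mapped to fixed-width time slices, each slice recording whether a spike is present. This Python snippet is an illustration of that representation only, not code from either paper; the bin width and spike times are invented.

```python
# Illustrative: represent a spike train as a sequence of bits, one per time slice
# (1 = spike present in that slice, 0 = absent). Times are in milliseconds.
def binarize_spike_train(spike_times, duration, bin_width):
    """Map spike times in [0, duration) onto fixed-width time slices."""
    n_bins = int(duration / bin_width)
    bits = [0] * n_bins
    for t in spike_times:
        idx = min(int(t / bin_width), n_bins - 1)  # clamp boundary spikes
        bits[idx] = 1
    return bits

print(binarize_spike_train([3.0, 11.0, 12.0, 28.0], duration=30.0, bin_width=5.0))
# spikes at 3, 11, 12, 28 ms with 5 ms slices -> [1, 0, 1, 0, 0, 1]
```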
“…Intensive research is being conducted to employ data preprocessing for larger datasets. This is particularly evident in biomedicine where data was gathered for thousands of variables, which are used to analyze the medical condition of patients [3]. Missing values are very common in the process and they can have a direct impact on the dataset as a whole.…”
Section: Introduction
confidence: 99%