2015
DOI: 10.1515/bpasts-2015-0105
Discretization of data using Boolean transformations and information theory based evaluation criteria

Abstract: Discretization is one of the most important parts of decision table preprocessing. Transforming continuous attribute values into discrete intervals influences further analysis using data mining methods. In particular, the accuracy of generated predictions depends heavily on the quality of discretization. The paper describes three new heuristic algorithms for discretization of numeric data, based on Boolean reasoning. Additionally, an entropy-based evaluation of discretization …
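The paper's Boolean-reasoning algorithms cannot be reconstructed from the truncated abstract, but the entropy-based evaluation it mentions is a standard criterion. The following is a minimal, illustrative Python sketch of scoring candidate cut-points by weighted class entropy (lower is better); the function names and toy data are assumptions for illustration, not taken from the paper.

```python
# Minimal sketch of an entropy criterion for supervised discretization:
# score each candidate cut-point by the class entropy of the split it induces.
# Not the paper's algorithm; names and data are illustrative.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def cut_point_entropy(values, labels, cut):
    """Weighted class entropy after a two-way split at `cut` (lower is better)."""
    left = [lab for v, lab in zip(values, labels) if v <= cut]
    right = [lab for v, lab in zip(values, labels) if v > cut]
    n = len(labels)
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)

# Example: pick the best cut for one numeric attribute (values assumed sorted).
values = [1.2, 1.5, 2.1, 2.8, 3.0, 3.9]
labels = ["a", "a", "a", "b", "b", "b"]
cuts = [(v1 + v2) / 2 for v1, v2 in zip(values, values[1:])]
best = min(cuts, key=lambda c: cut_point_entropy(values, labels, c))
print(best)  # 2.45 -> this cut separates the two classes perfectly (entropy 0)
```

A full supervised discretizer would apply such a criterion recursively or greedily over all attributes; this sketch only shows the evaluation step the abstract alludes to.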

Cited by 5 publications (4 citation statements) | References 32 publications
“…Some heuristic measures, e.g. entropy [34], can be used to determine the best cut-points. In the case of unsupervised methods, information about class labels is omitted during discretisation.…”
Section: Objectives and Algorithms of Discretisation
confidence: 99%
“…Assuming that a spike train is being recorded within some time interval, so that in each time slice, a spike is either present or absent, it is natural and justified to represent the spike train as a sequence of bits [32]. Discretisation is one of the most important points of neural communication [33]. Such discretisation allows us to treat both a neuron's stimuli and its response strictly as binary stochastic processes.…”
Section: Theory and Models
confidence: 99%
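The binarisation described in the statement above is concrete enough to sketch: spike times are mapped to fixed-width time slices, each slice recording whether a spike is present. This Python snippet is an illustration of that representation only, not code from either paper; the bin width and spike times are invented.

```python
# Illustrative: represent a spike train as a sequence of bits, one per time slice
# (1 = spike present in that slice, 0 = absent). Times are in milliseconds.
def binarize_spike_train(spike_times, duration, bin_width):
    """Map spike times in [0, duration) onto fixed-width time slices."""
    n_bins = int(duration / bin_width)
    bits = [0] * n_bins
    for t in spike_times:
        idx = min(int(t / bin_width), n_bins - 1)  # clamp boundary spikes
        bits[idx] = 1
    return bits

print(binarize_spike_train([3.0, 11.0, 12.0, 28.0], duration=30.0, bin_width=5.0))
# spikes at 3, 11, 12, 28 ms with 5 ms slices -> [1, 0, 1, 0, 0, 1]
```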
“…Intensive research is being conducted to employ data preprocessing for larger datasets. This is particularly evident in biomedicine where data was gathered for thousands of variables, which are used to analyze the medical condition of patients [3]. Missing values are very common in the process and they can have a direct impact on the dataset as a whole.…”
Section: Introduction
confidence: 99%