2022
DOI: 10.3390/e24081079
Information Theoretic Methods for Variable Selection—A Review

Abstract: We review the principal information theoretic tools and their use for feature selection, with the main emphasis on classification problems with discrete features. Since it is known that empirical versions of conditional mutual information perform poorly for high-dimensional problems, we focus on various ways of constructing its counterparts and the properties and limitations of such methods. We present a unified way of constructing such measures based on truncation, or truncation and weighing, for the Möbius e…
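To make concrete why the empirical (plug-in) version of conditional mutual information degrades in high dimensions, the sketch below (an illustration added here, not code from the review) estimates I(X;Y|Z) from discrete samples: every distinct value of the conditioning variable Z fragments the sample further, so the frequency estimates quickly become unreliable.

```python
from collections import Counter
from math import log2

def empirical_cmi(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for discrete samples.

    Uses raw empirical frequencies; as the number of distinct values of
    Z grows (high-dimensional conditioning), each cell of the joint
    table receives few samples and the estimate becomes unreliable.
    """
    n = len(x)
    pxyz = Counter(zip(x, y, z))   # joint counts over (X, Y, Z)
    pxz = Counter(zip(x, z))
    pyz = Counter(zip(y, z))
    pz = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in pxyz.items():
        p_xyz = c / n
        cmi += p_xyz * log2((p_xyz * (pz[zi] / n)) /
                            ((pxz[(xi, zi)] / n) * (pyz[(yi, zi)] / n)))
    return cmi
```

For identical balanced binary X and Y with constant Z this returns H(X) = 1 bit, and it returns 0 when X and Y are empirically independent given Z.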

Cited by 8 publications (4 citation statements)
References 67 publications
“…Along with dealing with the deterministic number of selected factors, there is a research approach based on developing the rules for stopping the procedures used to identify the relevant set. In this regard, we indicate, e.g., article [52], dedicated to information methods for selecting relevant factors. The study of non-discrete stochastic models is also of undoubted interest, see, e.g., [53].…”
Section: Discussion
confidence: 99%
“…Redundancy is a problem widely studied by researchers, and several information-theoretic methods have been proposed [19,27–29] in this context. Information theory-based methods intend to select a subset S of features from the original feature space F that maximizes the mutual information I(S;C) while minimizing the redundancy level.…”
Section: Related Work
confidence: 99%
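The criterion described in this statement, relevance I(S;C) traded against redundancy among selected features, can be illustrated by a minimal greedy mRMR-style procedure. This is a hedged sketch of that general scheme, not the method of any particular cited paper; `mi` and `mrmr_select` are names chosen here for illustration.

```python
from collections import Counter
from math import log2

def mi(a, b):
    """Plug-in mutual information (bits) between two discrete sequences."""
    n = len(a)
    pab, pa, pb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum(c / n * log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def mrmr_select(features, target, k):
    """Greedily pick k features: relevance I(X;C) minus mean redundancy
    with the already-selected set (max-relevance, min-redundancy)."""
    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < k:
        def score(i):
            rel = mi(features[i], target)
            red = (sum(mi(features[i], features[j]) for j in selected)
                   / len(selected)) if selected else 0.0
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A duplicate of an already-selected feature scores zero in later rounds (its redundancy cancels its relevance), which is exactly the behavior the redundancy penalty is meant to produce.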
“…The paper suggests using Delayed Mutual Information (DMI) to incorporate the slight time delays among brain areas inside the BN method, because information theoretic approaches do not make any hypothesis about the dependency between time series [39]. By simulating the delays found in the brain, we hypothesize that we can create a more representational model for a Bayesian Network.…”
Section: Introduction
confidence: 99%
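Delayed mutual information, as invoked in this statement, evaluates I(x_t; y_{t+τ}) across a range of lags τ so that the lag maximizing the shared information can be read off. A minimal sketch under the assumption of discrete-valued series (the function names are illustrative, not from the cited paper):

```python
from collections import Counter
from math import log2

def mi(a, b):
    """Plug-in mutual information (bits) between two discrete sequences."""
    n = len(a)
    pab, pa, pb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum(c / n * log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def delayed_mi(x, y, max_lag):
    """MI between x_t and y_{t+tau} for tau = 0..max_lag: the resulting
    profile peaks at the lag where y carries most information about x."""
    return [mi(x[:len(x) - tau], y[tau:]) for tau in range(max_lag + 1)]
```

If y is simply x shifted by two steps, the profile peaks sharply at lag 2, which is how a characteristic inter-area delay would show up.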