Can high-order dependencies improve mutual information based feature selection?

Vinh, Nguyen X.; Zhou, Shuo; Chan, Jeffrey; Bailey, James E.

doi:10.1016/j.patcog.2015.11.007

Cited by 95 publications

(82 citation statements)

References 15 publications

“…For example, the JMI [19] criterion can be derived by setting β = γ = 1/|S|. In [17], the authors propose a new criterion by relaxing the first assumption. They show that, under the relaxed assumption that the selected features are conditionally independent given f_m and another feature f_i in S, the redundancy term can be approximated as the following…”

Section: Information Theoretic Feature Selection Methods

mentioning

confidence: 99%
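
The quoted statement refers to weights β and γ without showing the criterion they belong to. A minimal sketch follows, assuming the commonly used linear form J(f_m) = I(f_m; C) − β Σ I(f_m; f_i) + γ Σ I(f_m; f_i | C), with both sums running over the already selected features f_i in S. That form, the plug-in estimators, and all function names here are assumptions for illustration; they are not taken from the cited papers.

```python
from collections import Counter
from math import log


def entropy(*cols):
    """Plug-in joint entropy (in nats) of one or more discrete columns."""
    n = len(cols[0])
    counts = Counter(zip(*cols))
    return -sum(k / n * log(k / n) for k in counts.values())


def mi(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cmi(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def linear_criterion(fm, selected, c, beta, gamma):
    """Relevance minus weighted redundancy plus weighted conditional redundancy."""
    relevance = mi(fm, c)
    redundancy = sum(mi(fm, fi) for fi in selected)
    cond_redundancy = sum(cmi(fm, fi, c) for fi in selected)
    return relevance - beta * redundancy + gamma * cond_redundancy


def jmi_score(fm, selected, c):
    """JMI as the special case beta = gamma = 1/|S| (plain relevance if S is empty)."""
    if not selected:
        return mi(fm, c)
    w = 1.0 / len(selected)
    return linear_criterion(fm, selected, c, beta=w, gamma=w)
```

On an XOR target (c = a XOR b with a and b uniform and independent), feature b has zero relevance on its own, yet `jmi_score(b, [a], c)` equals log 2 because the conditional redundancy term captures the complementarity between a and b.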

“…Otherwise, f_m is discarded considering that it does not contribute to the score significantly. While selecting a new feature, its discretization level is also shifted by a small value δ from its original value (as selected previously based on J_rel, as shown in lines 16-21). This process helps to select the discretization level of features dynamically, considering its dependency with other features.…”

Section: Discretization and Feature Selection Based On Bias Corrected

mentioning

confidence: 99%

“…However, the computation of I(S; C) is an NP-hard problem [7]. To overcome this problem, different approximations such as MIFS [1], mRMR [10], JMI [19], and RelaxMRMR [17] have been proposed over the last decades. In these methods, MI terms such as feature relevancy (R), redundancy (r), conditional redundancy (c) and interaction (i) are considered in order to achieve a better approximation.…”

mentioning

confidence: 99%

“…In a recent method, mDSM [16], it is shown that incorporating bias correction for the R, r, and c terms improves the classification performance. However, the interaction term is not considered in mDSM, which needs to be addressed for better approximation [17].…”

mentioning

confidence: 99%

Discretization and Feature Selection Based on Bias Corrected Mutual Information Considering High-Order Dependencies

Roy

Sharmin

Ali

et al. 2020

Advances in Knowledge Discovery and Data Mining

Mutual Information (MI) based feature selection methods are popular due to their ability to capture the nonlinear relationship among variables. However, existing works rarely address the error (bias) that occurs due to the use of finite samples during the estimation of MI. To the best of our knowledge, none of the existing methods address the bias issue for the high-order interaction term, which is essential for better approximation of joint MI. In this paper, we first calculate the amount of bias of this term. Moreover, to select features using a χ²-based search, we also show that this term follows a χ² distribution. Based on these two theoretical results, we propose Discretization and feature Selection based on bias corrected Mutual information (DSbM). DSbM is extended by adding simultaneous forward selection and backward elimination (DSbM_fb). We demonstrate the superiority of DSbM over four state-of-the-art methods in terms of accuracy and the number of selected features on twenty benchmark datasets. Experimental results also demonstrate that DSbM outperforms the existing methods in terms of accuracy, Pareto optimality and the Friedman test. We also observe that, compared to DSbM, on some datasets DSbM_fb selects fewer features and increases accuracy.
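
The abstract mentions finite-sample bias and a χ² distribution but gives no formulas. As a rough illustration only, the sketch below uses the standard textbook result for the pairwise plug-in estimator, not the paper's derivation for the high-order interaction term: under independence, 2N·Î(X; Y) (in nats) is asymptotically χ² with (|X|−1)(|Y|−1) degrees of freedom, so the first-order bias is about (|X|−1)(|Y|−1)/(2N). The function names and the returned dictionary layout are hypothetical.

```python
from collections import Counter
from math import log


def plugin_mi(x, y):
    """Plug-in (maximum likelihood) estimate of I(X; Y) in nats."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(k / n * log(k * n / (px[a] * py[b]))
               for (a, b), k in pxy.items())


def bias_corrected_mi(x, y):
    """Subtract the first-order bias and report the G statistic 2N * MI.

    Assumption: levels are counted from the observed values, so the
    degrees of freedom are (|X| - 1)(|Y| - 1) for the observed alphabets.
    """
    n = len(x)
    dof = (len(set(x)) - 1) * (len(set(y)) - 1)
    mi_hat = plugin_mi(x, y)
    bias = dof / (2 * n)
    return {
        "mi": mi_hat,
        "bias": bias,
        "corrected": mi_hat - bias,
        "g_stat": 2 * n * mi_hat,  # compare against a chi-square quantile
        "dof": dof,
    }
```

A χ²-based search would keep a candidate only if `g_stat` exceeds the chosen χ² critical value for `dof` degrees of freedom (for example 3.841 at the 5% level when `dof` is 1); how DSbM applies this to discretization levels and the interaction term is not shown in this excerpt.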


“…Since MI based filter methods do not extract new features and thus are more interpretable, parallel to the development in deep learning, there has been a lot of effort to better approximate MI measures such as relevancy and redundancy. New information theoretic measures such as complementary information, the additional information that a gene has about the class, which is not found in the already selected subset of genes, have been proposed [15,27]. These methods attempt to estimate the joint mutual information of a feature subset with the class.…”

Section: Introduction

mentioning

confidence: 99%

Use of relevancy and complementary information for discriminatory gene selection from high-dimensional cancer data

Haque

Sharmin

Ali

et al. 2020

Preprint

With the advent of high-throughput technologies, life sciences are generating a huge amount of biomolecular data. Global gene expression profiles provide a snapshot of all the genes that are transcribed or not in a cell or in a tissue at a particular moment under a particular condition. The high-dimensionality of such gene expression data (i.e., very large number of features/genes analyzed in relatively much less number of samples) makes it difficult to identify the key genes (biomarkers) that are truly and more significantly attributing to a particular phenotype or condition, such as cancer or disease, de novo. With the increase in the number of genes, simple feature selection methods show poor performance for both selecting the effective and informative features and capturing biological information. Addressing these issues, here we propose Mutual information based Gene Selection method (MGS) for selecting informative genes and two ranking methods based on frequency (MGS_f) and Random Forest (MGS_rf) for ranking the selected genes. We tested our methods on four real gene expression datasets derived from different studies on cancerous and normal samples. Our methods obtained better classification rate with the datasets compared to recently reported methods. Our methods could also detect the key relevant pathways with a causal relationship to the phenotype.

[…] indicator of a particular state). Identification of these informative genes is very important for elucidating developmental and disease mechanisms, disease diagnosis, drug development, etc. Especially, for different cancer diseases, these informative genes may be invaluable for the improvement of diagnosis, prognosis, and treatment.

Usually, studies to generate cancer specific gene expression profiles comprise a small number of control and patient samples in comparison to tens of thousands of genes (high dimensionality of the data) in each sample where only a few numbers of genes are responsible for a disease. From a large set of genes, identification of a subset that is differently expressed in cancerous cells compared to the normal ones, is a challenging task and is considered as NP hard or NP-complete [1]. Therefore, the feature/gene selection methods can be a useful way to identify a subset of genes relevant to particular cancer for better diagnosis and treatment. In this paper, we use the terms "gene" and "feature" interchangeably.

In bioinformatics, several gene selection methods have been proposed, particularly for cancer data classification [2][3][4]. "Wrapper" and "Filter" are two popular categories of feature selection methods [5] where wrapper methods are classifier dependent and filter methods are classifier independent and their performance mainly depends on the selection of a criterion. Wrapper based methods select the most discriminant subset of features by minimizing the prediction error of a particular classifier [6]. Suppor...

Controlling Costs in Feature Selection: Information Theoretic Approach

Teisseyre

Klonecki

2021

Computational Science – ICCS 2021
