2016
DOI: 10.1371/journal.pone.0157567

A New Algorithm to Optimize Maximal Information Coefficient

Abstract: The maximal information coefficient (MIC) captures dependences between paired variables, including both functional and non-functional relationships. In this paper, we develop a new method, ChiMIC, to calculate the MIC values. The ChiMIC algorithm uses the chi-square test to terminate grid optimization and thereby removes the maximal grid size limitation of the original ApproxMaxMI algorithm. Computational experiments show that the ChiMIC algorithm can maintain the same MIC values for noiseless functional relat…

Cited by 27 publications (30 citation statements)
References: 25 publications

“…MIC can capture dependence between pairs of variables, including both functional and nonfunctional relationships. However, the ApproxMaxMI method provided by Reshef et al (2011) results in a larger MIC score for paired variables under finite-sample conditions (Chen et al, 2016). Here, we use the improved algorithm ChiMIC to calculate the MIC value (Chen et al, 2016).…”
Section: Methods (mentioning)
confidence: 99%
“…However, the ApproxMaxMI method provided by Reshef et al (2011) results in a larger MIC score for paired variables under finite-sample conditions (Chen et al, 2016). Here, we use the improved algorithm ChiMIC to calculate the MIC value (Chen et al, 2016). The NDC score for a pair of data series x (gene) and y (phenotype) is defined as follows:…”
Section: Methods (mentioning)
confidence: 99%
“…Given n = 100, the MIC score for independent paired variables should be zero, and the corresponding partition should be a 2 × 2 grid. However, the ApproxMaxMI algorithm tends to fall into the maximal grid size (100^0.6 ≈ 16), the corresponding partition is a 2 × 8 grid and the corresponding MIC score is 0.24, which leads to a nontrivial MIC score for independent paired variables under finite samples [40]. Recently, Chen et al [40] presented the ChiMIC algorithm, which can control the excessive grid partitions of the ApproxMaxMI algorithm.…”
Section: Datasets and Methods (mentioning)
confidence: 99%
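
The figure of 16 in the statement above comes from the default bound n^0.6 on the grid size, evaluated at n = 100. A minimal sketch of that arithmetic follows; the function name `max_grid_size` and the parameter name `alpha` are illustrative choices, not taken from the cited papers' code.

```python
def max_grid_size(n, alpha=0.6):
    """Bound n**alpha on the grid size considered for n samples.

    `alpha` is a hypothetical parameter name; 0.6 is the exponent
    quoted in the citation statement.
    """
    return n ** alpha


bound = max_grid_size(100)
print(f"{bound:.2f}")  # 15.85, which the statement rounds to 16
```

A 2 × 8 partition has 16 cells, so it sits right at this rounded bound, which is the behaviour the statement describes for independent variables.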
“…However, the ApproxMaxMI algorithm tends to fall into the maximal grid size (100^0.6 ≈ 16), the corresponding partition is a 2 × 8 grid and the corresponding MIC score is 0.24, which leads to a nontrivial MIC score for independent paired variables under finite samples [40]. Recently, Chen et al [40] presented the ChiMIC algorithm, which can control the excessive grid partitions of the ApproxMaxMI algorithm. Removing the maximal grid size limitation in ApproxMaxMI, ChiMIC uses a chi-square test based on a local r × 2 grid to determine whether the new endpoint should be introduced.…”
Section: Datasets and Methods (mentioning)
confidence: 99%
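
The chi-square test on a local r × 2 grid mentioned above can be illustrated with a plain Pearson chi-square test on a table of counts. This is only a sketch of the general idea, not the ChiMIC implementation: the function names, the 5% significance level, and the way the table is built from the grid are all assumptions, and the critical values are the standard tabulated ones for 1 to 5 degrees of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Standard upper 5% critical values of the chi-square distribution (df 1..5).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def accept_new_endpoint(table):
    """Hypothetical decision rule: keep the endpoint only if the r x 2 table
    of counts around it departs significantly from independence (assumed 5% level)."""
    stat, dof = chi_square_statistic(table)
    return stat > CRITICAL_05[dof]


# Points spread evenly over both columns: no evidence for a new endpoint.
print(accept_new_endpoint([[10, 10], [10, 10]]))  # False
# Points concentrated on the diagonal: the split is informative.
print(accept_new_endpoint([[20, 0], [0, 20]]))    # True
```

Under a rule of this kind, partitioning stops once further endpoints no longer pass the test, which is how a significance test can stand in for a fixed cap on grid size.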
“…Although MIC has gained considerable attention (Nature, 2012; Speed, 2011; Zhang et al, 2014), there were also several discussions about some of its properties (N. Simon, 2011; Kinney and Atwal, 2014; Reshef et al, 2014). One of the main issues resides in the computational cost of MIC's original implementation: a dynamic programming algorithm called ApproxMaxMI that several studies in the literature tried to optimize (Albanese et al, 2013; Zhang et al, 2014; Tang et al, 2014; Chen et al, 2016). Apart from these issues, all the mentioned methods need categorical data to be converted to numerical in order to be applied, which cannot be done in many cases with non-ordinal variables.…”
Section: Introduction (mentioning)
confidence: 99%