2011
DOI: 10.1002/gepi.20621
Entropy-based information gain approaches to detect and to characterize gene-gene and gene-environment interactions/correlations of complex diseases

Abstract: For complex diseases, the relationship between genotypes, environment factors and phenotype is usually complex and nonlinear. Our understanding of the genetic architecture of diseases has increased considerably over recent years. However, both conceptually and methodologically, detecting gene-gene and gene-environment interactions remains a challenge, despite the existence of a number of efficient methods. One method that offers great promise but has not yet been widely applied to genomic data is the entrop…


Year Published: 2013–2024

Cited by 62 publications (62 citation statements)
References 37 publications
“…Therefore, in addition to the mean-based measures (i.e., c-index or IDI) that would be subject to a ceiling effect, we measured the enhancement to the HCC predictive model by the increase in conditional variance of predicted HCC probabilities within AFP strata 13 and the information gain (Kullback-Leibler divergence) 14, which quantifies the increase in heterogeneity of the distribution of HCC cases within AFP strata. This method has recently been applied in the biomedical sciences 15-17. These two concepts are used to define a preferred sequence of attributes; an attribute with high mutual information should be preferred to other attributes.…”
Section: Discussion
confidence: 99%
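The information gain this passage invokes is the Kullback-Leibler divergence between two discrete distributions of cases across strata. A minimal sketch of that computation (the stratum probabilities below are hypothetical illustrations, not values from the cited study):

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) for discrete distributions,
    given as aligned lists of probabilities. Terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: distribution of cases across three strata before
# (uniform baseline) and after adding a new marker to the model.
baseline = [1/3, 1/3, 1/3]      # strata carry no extra information
with_marker = [0.6, 0.3, 0.1]   # cases concentrate in one stratum
gain = kl_divergence(with_marker, baseline)  # ≈ 0.2007 nats
```

A larger divergence from the uniform baseline means the strata separate cases more sharply, which is the sense in which the quoted text reads it as increased heterogeneity.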
“…However, these conventional measures of discrimination may be insensitive to detecting small improvements in model performance when a new marker is added to a model that already includes important predictors. We therefore also measured changes in the conditional variance of predicted HCC probabilities within AFP strata 13 (a component of the total variance, in addition to the conditional-means variance captured by the c-index), and the information-theoretic information gain (Kullback-Leibler divergence) 14, which quantifies the increase in heterogeneity of the distribution of HCC cases within AFP strata 15. Calibration and model fit were assessed graphically by plotting model-derived probabilities against raw probabilities, and analytically by the Hosmer-Lemeshow chi-square statistic.…”
Section: Methods
confidence: 99%
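The passage closes with the Hosmer-Lemeshow chi-square statistic for calibration. A simplified sketch, assuming samples are sorted by predicted probability and split into equal-size groups (real implementations typically use deciles of risk; the function name is illustrative):

```python
def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Simplified Hosmer-Lemeshow chi-square: compare observed vs. expected
    event counts within groups of sorted predicted probabilities."""
    pairs = sorted(zip(y_prob, y_true))
    size = max(1, len(pairs) // n_groups)
    chi2 = 0.0
    for start in range(0, len(pairs), size):
        group = pairs[start:start + size]
        observed = sum(y for _, y in group)    # events seen in this group
        expected = sum(p for p, _ in group)    # events predicted by the model
        n = len(group)
        mean_p = expected / n
        if 0 < mean_p < 1:                     # skip degenerate groups
            chi2 += (observed - expected) ** 2 / (n * mean_p * (1 - mean_p))
    return chi2
```

A well-calibrated model yields small group-wise gaps between observed and expected counts, and hence a small statistic; in practice the value is compared against a chi-square distribution with n_groups − 2 degrees of freedom.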
“…Given a dataset D with N samples and M SNPs, the clustering process is described as follows:
Initialization: k SNPs are randomly selected from the M SNPs as the initial centroids of k clusters $C_j$ ($j = 1, 2, \dots, k$), where k is the preset number of SNP groups.
Clustering: Mutual information can measure the dependency or associativity between two variables [40,41]. Given this, we take mutual information to measure the associativity between two SNPs.…”
Section: Methods
confidence: 99%
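The mutual information used here as an associativity measure between two SNPs can be estimated directly from genotype counts. A minimal plug-in estimator sketch (the genotype coding and variable names are illustrative, not taken from the cited implementation):

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    """Plug-in estimate of mutual information (in nats) between two discrete
    variables, e.g. per-sample genotype codes of two SNPs."""
    n = len(x)
    count_x = Counter(x)            # marginal counts of x
    count_y = Counter(y)            # marginal counts of y
    count_xy = Counter(zip(x, y))   # joint counts of (x, y)
    mi = 0.0
    for (u, v), c in count_xy.items():
        p_uv = c / n
        # p_uv * log( p_uv / (p_u * p_v) ), written with counts
        mi += p_uv * log(p_uv * n * n / (count_x[u] * count_y[v]))
    return mi
```

Identical SNPs give MI equal to the entropy of either one; independent SNPs give (up to estimation noise) zero, which is why MI can rank candidate cluster members by dependence.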
“…After dividing all SNPs into k clusters, ClusterMI then applies conditional mutual information to screen two-locus combinations in each cluster as follows: For the m-th cluster, the association between a two-locus combination and the disease can be measured by conditional mutual information [40,41]. The conditional mutual information of a two-locus combination $(S_i, S_j)$, $S_i, S_j \in G_m$, under case ($y = 1$) can be calculated as:

$$\mathrm{cMI}(S_i, S_j) = \sum_{u=1}^{3} \sum_{v=1}^{3} P(S_i = u, S_j = v \mid y = 1) \log \frac{P(S_i = u, S_j = v \mid y = 1)}{P(S_i = u \mid y = 1)\, P(S_j = v \mid y = 1)}$$

where $P(S_i = u, S_j = v \mid y = 1)$ denotes the joint probability of $S_i$ and $S_j$ under the case, and $P(S_i = u \mid y = 1)$ and $P(S_j = v \mid y = 1)$ are the marginal probabilities of $S_i$ and $S_j$ under the case, respectively.…”
Section: Methods
confidence: 99%
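This conditional mutual information is the same dependence computation restricted to case samples. A sketch implementing it, assuming genotype codes per sample and a 0/1 phenotype vector (function and variable names are illustrative):

```python
from collections import Counter
from math import log

def conditional_mi_cases(si, sj, y):
    """Conditional mutual information of two SNPs estimated over case
    samples only (y == 1), in nats. si, sj: per-sample genotype codes;
    y: per-sample 0/1 phenotype labels."""
    cases = [(a, b) for a, b, label in zip(si, sj, y) if label == 1]
    n = len(cases)
    count_i = Counter(a for a, _ in cases)   # marginal counts of S_i | case
    count_j = Counter(b for _, b in cases)   # marginal counts of S_j | case
    count_ij = Counter(cases)                # joint counts of (S_i, S_j) | case
    cmi = 0.0
    for (u, v), c in count_ij.items():
        p_uv = c / n
        cmi += p_uv * log(p_uv * n * n / (count_i[u] * count_j[v]))
    return cmi
```

A high value flags a two-locus combination whose genotypes co-occur among cases more strongly than their case-conditional marginals predict, which is the screening signal the quoted method uses.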
“…Learning strategies have been applied to epistasis estimation in the context of big data, such as Machine Learning (ML) decision trees [4, 14], information theory [8, 25] and multifactor dimensionality reduction (MDR) [28]. In the statistical framework, mixed models based on likelihood inference have been used to estimate epistatic effects using animal models and epistatic G-BLUP based on genomic additive and dominance matrices.…”
Section: Introduction
confidence: 99%