2006
DOI: 10.1158/1055-9965.epi-05-0717

Modeling Exposures for DNA Methylation Profiles

Abstract: We extend the finite mixture model to estimate the association between exposure and latent disease subtype measured by DNA methylation profiles. Estimates from this model are compared with those obtained from the simpler two-phase approach of first clustering the DNA methylation data followed by associating exposure with disease subtype using logistic regression. The two models are fit to data from a study of colorectal adenomas and are compared in a simulation study. Depending on the analytic approach, we obt…

Cited by 5 publications (6 citation statements)
References: 20 publications
“…In applications such as market segmentation (Wedel & Kamakura, 2000) usually subsets of features are included in the mixture components. Siegmund et al (2006) used an MOE model to estimate the association between exposure and latent disease subtype measured by DNA methylation profiles; only the mixing probabilities were modelled as a function of the $p$ potential exposures. In machine learning applications (Jacobs, Peng & Tanner, 1997) different subsets of the features are included in both the mixing probabilities and the mixture components.…”
Section: Introduction (mentioning)
confidence: 99%
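
The model structure described in this statement can be written out as a short sketch. The notation below ($K$ subtypes, exposure vector $x_i$, methylation profile $y_i$, gating coefficients $\alpha_k$, $\beta_k$, component parameters $\theta_k$) is assumed for illustration and is not taken from the cited paper; only the mixing probabilities depend on the $p$ exposures, while the component densities do not.

$$
f(y_i \mid x_i) = \sum_{k=1}^{K} \pi_k(x_i)\, f_k(y_i; \theta_k),
\qquad
\pi_k(x_i) = \frac{\exp(\alpha_k + x_i^{\top}\beta_k)}{\sum_{l=1}^{K} \exp(\alpha_l + x_i^{\top}\beta_l)},
\qquad
\alpha_1 = 0,\ \beta_1 = 0 .
$$

With $K = 2$ and the first subtype as reference, $\pi_2(x_i)$ reduces to an ordinary logistic model, and $\beta_2 \in \mathbb{R}^p$ holds the log-odds ratios relating each exposure to subtype membership.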
“…In one such example, however, Siegmund et al (2006) used a finite mixture model to estimate the association between exposure and latent disease subtype measured by DNA methylation profiles and compared these results with a simpler two-phase approach, first clustering the DNA methylation data and then relating these clusters to exposure using logistic regression.…”
Section: Methodological Challenges (mentioning)
confidence: 99%
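
The two-phase approach described in this statement can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy data, the binary coding of methylation, the 2-means clustering step (the source does not say which clustering method was used), and the helper names `two_means` and `logistic_fit`.

```python
import math

def two_means(profiles, n_iter=20):
    """Phase 1 (assumed method): hard 2-means clustering of binary profiles.

    Cluster 1 is initialised at the most-methylated profile, cluster 0 at
    the least-methylated one, so the labels have a fixed interpretation.
    """
    sums = [sum(p) for p in profiles]
    centers = [list(profiles[sums.index(min(sums))]),
               list(profiles[sums.index(max(sums))])]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assign each profile to the nearest center (squared Euclidean distance).
        labels = [min((0, 1),
                      key=lambda k: sum((v - c) ** 2 for v, c in zip(p, centers[k])))
                  for p in profiles]
        # Recompute each center as the mean profile of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def logistic_fit(x, y, n_iter=25):
    """Phase 2: Newton-Raphson fit of logit P(y = 1) = a + b * x."""
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi        # information matrix entries
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Toy data (hypothetical): 8 subjects, 4 loci, binary exposure.
profiles = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1],
            [0, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
exposure = [1, 1, 1, 0, 0, 0, 0, 1]

labels = two_means(profiles)                  # phase 1: cluster
alpha, beta = logistic_fit(exposure, labels)  # phase 2: regress cluster on exposure
print(labels, round(beta, 3))
```

Because the second phase treats the cluster labels as if they were observed without error, any uncertainty in the clustering step is ignored when the log-odds ratio `beta` is estimated.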
“…Using standard measurement error modeling approaches, one might then treat the latent process as being determined by one’s constitutional genotype $G_i$ and exposure history $E_i(t)$, through a linear longitudinal random-effects model, with the latent process in turn being a risk factor for disease state $Y_i(t)$, through a standard survival analysis model (see, for example, Elashoff et al 2007, for similar latent process methods applied to longitudinal data on CD4 cell counts in relation to AIDS incidence). To address the high-dimensional nature of epigenetic data, one could use the kinds of cluster-analysis techniques described by Siegmund et al (2006), treating the cluster rather than individual epigenetic marks, as a latent risk factor for disease (see Molitor et al 2003 for an example of similar latent cluster methods applied to haplotype associations).…”
Section: Methodological Challenges (mentioning)
confidence: 99%
“…The extended finite mixture model is fit by Siegmund et al (2006) using the EM algorithm. The parameter of interest is β, the log-odds ratio measuring the association between exposure and disease subtype.…”
Section: Modeling Exposures For Latent Disease Subtypes (mentioning)
confidence: 99%
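
To make the EM fit mentioned in this statement concrete, here is a minimal sketch under stated assumptions; it is not the cited authors' implementation. It assumes two latent subtypes, binary methylation calls that are independent Bernoulli within a subtype, and a single binary exposure, so that the logistic M-step has a closed form. The function names `em_mixture` and `toy_data` and the data are hypothetical.

```python
import math

def em_mixture(y, x, n_iter=50):
    """EM for a 2-class mixture with P(class 1 | x) = expit(alpha + beta * x).

    y: list of binary methylation profiles; x: list of binary exposures.
    Returns (alpha, beta, theta, trace), where trace holds the observed-data
    log-likelihood evaluated at the start of each iteration.
    """
    m = len(y[0])
    theta = [[0.4] * m, [0.6] * m]  # class 1 starts as the more-methylated class
    alpha = beta = 0.0
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability tau_i that subject i is in class 1.
        tau, loglik = [], 0.0
        for yi, xi in zip(y, x):
            pi1 = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
            lik = []
            for k in (0, 1):
                f = 1.0
                for yij, t in zip(yi, theta[k]):
                    f *= t if yij else 1.0 - t
                lik.append(f)
            mix = (1.0 - pi1) * lik[0] + pi1 * lik[1]
            tau.append(pi1 * lik[1] / mix)
            loglik += math.log(mix)
        trace.append(loglik)
        # M-step, part 1: weighted methylation frequencies per class and locus.
        for k in (0, 1):
            w = [t if k == 1 else 1.0 - t for t in tau]
            theta[k] = [sum(wi * yi[j] for wi, yi in zip(w, y)) / sum(w)
                        for j in range(m)]
        # M-step, part 2: closed form, valid only for a binary exposure.
        # A continuous exposure would need a weighted logistic regression here.
        p = []
        for level in (0, 1):
            group = [t for t, xi in zip(tau, x) if xi == level]
            p.append(sum(group) / len(group))
        alpha = math.log(p[0] / (1.0 - p[0]))
        beta = math.log(p[1] / (1.0 - p[1])) - alpha
    return alpha, beta, theta, trace

def toy_data():
    """Hypothetical data: 80 subjects, 6 loci, one discordant locus each."""
    y, x = [], []
    for subtype, exposed, count in [(1, 1, 30), (1, 0, 10), (0, 1, 10), (0, 0, 30)]:
        for i in range(count):
            profile = [subtype] * 6
            profile[i % 6] = 1 - subtype
            y.append(profile)
            x.append(exposed)
    return y, x

y, x = toy_data()
alpha, beta, theta, trace = em_mixture(y, x)
print(round(beta, 3), round(trace[-1], 3))
```

Unlike a hard clustering step, the E-step keeps each subject's subtype membership as a probability, so the estimate of `beta` reflects the uncertainty in subtype assignment. The log-likelihood trace should never decrease from one iteration to the next, which is a useful check on any EM implementation.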
“…Its standard error is computed using the observed information matrix as described by Louis (1982). Siegmund et al (2006) found that estimates from the extended finite-mixture model were unbiased and had the correct standard error estimates; however, adequate sample sizes were needed in order for the algorithm to converge. These results were compared to the naïve two-step analysis.…”
Section: Modeling Exposures For Latent Disease Subtypes (mentioning)
confidence: 99%