In genetic epidemiological studies, family history data are collected on
relatives of study participants and used to estimate the age-specific risk of
disease for individuals who carry a causal mutation. However, a family
member’s genotype may not be collected, either because of the high cost of
an in-person interview to obtain a blood sample or because the relative has
died. Previously, efficient nonparametric estimation of genotype-specific risk
in censored mixture data was proposed without considering covariates. With multiple predictive
risk factors available, risk estimation requires a multivariate model to account
for additional covariates that may affect disease risk simultaneously.
Therefore, it is important to consider the role of covariates when estimating
genotype-specific distributions from family history data. We propose
an estimation method that permits more precise risk prediction by controlling
for individual characteristics and incorporating interaction effects with
missing genotypes in relatives, so that gene-gene and gene-environment
interactions can be handled within a single model. We examine the performance
of the proposed methods through simulations and apply
them to estimate the age-specific cumulative risk of Parkinson’s disease
(PD) in carriers of the LRRK2 G2019S mutation using data from first-degree
relatives who are at genetic risk for PD. The utility of the estimated carrier risk
is demonstrated by designing a future clinical trial under various
assumptions. Such sample size estimation appears in the Huntington’s
disease literature, where it uses the length of the abnormal CAG repeat
expansion in the HTT gene, but is less common in the PD literature.