Likelihood Ratio Tests in Rare Variant Detection for Continuous Phenotypes

Zeng, Ping; Zhao, Yang; Liu, Jin; Liu, Liya; Zhang, Liwei; Wang, Ting; Huang, Shuiping; Chen, Feng

doi:10.1111/ahg.12071

Cited by 23 publications

(32 citation statements)

References 48 publications


“…These results are consistent with those reported by Zeng et al [53] and by Lippert et al [30], who found the LRT to be generally more powerful than the score test across their simulated settings. Although Lippert et al did not consider the behavior of the two tests under misspecified weights, they reported the same pattern of results in real data analysis, where the LRT yielded consistently more associations than the score test.…”

Section: Results (supporting)

confidence: 92%

The Weighting is the Hardest Part: On the Behavior of the Likelihood Ratio Test and the Score Test Under a Data-Driven Weighting Scheme in Sequenced Samples

et al. 2017


Sequence-based association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Because the true weights are generally unknown, and so are subject to misspecification, we examined the efficiency of a data-driven weighting scheme. We propose the use of a set of theoretically defensible weighting schemes, of which, we assume, the one that gives the largest test statistic is likely to capture best the allele frequency-functional effect relationship. We show that the use of alternative weights obviates the need to impose arbitrary frequency thresholds in sequence data association analyses. As both the score test and the likelihood ratio test (LRT) may be used in this context, and may differ in power, we characterize the behavior of both tests. We found that the two tests have equal power if the set of weights resembled the correct ones. However, if the weights are badly specified, the LRT shows superior power (due to its robustness to misspecification). With this data-driven weighting procedure the LRT detected significant signal in genes located in regions already confirmed as associated with schizophrenia – the PRRC2A (P=1.020E-06) and the VARS2 (P=2.383E-06) – in the Swedish schizophrenia case-control cohort of 11,040 individuals with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful. However, LRT has the advantageous properties of being generally more robust and more powerful under weight misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
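The data-driven weighting idea in this abstract (try several defensible weight sets and keep the one with the largest statistic) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses a SKAT-style score statistic for a continuous phenotype with an intercept-only null model, Beta-density weights on the minor allele frequency, weight vectors rescaled to unit length so that statistics are comparable across schemes, and a permutation null for the maximum. The function names (`beta_weight`, `variant_scores`, `max_weighted_q`, `data_driven_test`) are hypothetical.

```python
import math
import random


def beta_weight(maf, a, b):
    """Beta(a, b) density at the minor allele frequency, used as a variant weight."""
    x = min(max(maf, 1e-6), 1 - 1e-6)  # guard against log(0)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def variant_scores(genotypes, residuals):
    """Per-variant score contributions s_j = sum_i g_ij * r_i."""
    m = len(genotypes[0])
    return [sum(row[j] * r for row, r in zip(genotypes, residuals)) for j in range(m)]


def max_weighted_q(scores, weight_sets):
    """Return (scheme name, Q) for the scheme maximising Q = sum_j (w_j * s_j)^2."""
    qs = {name: sum((w * s) ** 2 for w, s in zip(ws, scores))
          for name, ws in weight_sets.items()}
    best = max(qs, key=qs.get)
    return best, qs[best]


def data_driven_test(genotypes, y, schemes, n_perm=200, rng=None):
    """Maximum-over-schemes statistic with a permutation p-value (assumed null)."""
    rng = rng or random.Random(0)
    n, m = len(y), len(genotypes[0])
    mafs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]

    # Assumption: rescale each weight vector to unit length so that the
    # statistics from different schemes are on a comparable scale.
    weight_sets = {}
    for name, (a, b) in schemes.items():
        raw = [beta_weight(f, a, b) for f in mafs]
        norm = math.sqrt(sum(w * w for w in raw))
        weight_sets[name] = [w / norm for w in raw]

    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]  # residuals under the intercept-only null
    best, q_obs = max_weighted_q(variant_scores(genotypes, resid), weight_sets)

    # Permute residuals and recompute the maximum each time, so the selection
    # of the best scheme is accounted for in the null distribution.
    exceed = 0
    perm = list(resid)
    for _ in range(n_perm):
        rng.shuffle(perm)
        _, q_perm = max_weighted_q(variant_scores(genotypes, perm), weight_sets)
        exceed += q_perm >= q_obs
    return best, q_obs, (1 + exceed) / (1 + n_perm)
```

A call might pass `schemes = {"flat": (1, 1), "beta_1_25": (1, 25), "beta_half_half": (0.5, 0.5)}`, where the last two correspond to commonly used frequency-based weights. Recomputing the maximum inside each permutation is the design choice that matters here: comparing the selected statistic against a single-scheme null would overstate significance.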



“…In the bootstrap ReLRT algorithm, B was set to 2000, and for the approximation mixture in we selected L = 2000, 1500, 1000, 800, 500, 300 and 100. We also implemented the simulation-based algorithm for the finite sample null distribution of ReLRT [20, 35, 36], and the number of runs in this algorithm is set to 10000. Besides ReLRT, the burden test, the optimal SKAT (SKAT-O) [37, 38], SKAT [12], the genetic random field (GenRF) model [39, 40] and the mixed effects score test (MiST) [41] were conducted together for comparisons.…”

Section: Settings Of Numerical Study (mentioning)

confidence: 99%

Bootstrap Restricted Likelihood Ratio Test for the Detection of Rare Variants

Zeng, Wang 2015

Self Cite


In this paper the detection of rare variant associations with continuous phenotypes of interest is investigated via the likelihood-ratio based variance component test under the framework of linear mixed models. The hypothesis testing is challenging and nonstandard, since under the null the variance component is located on the boundary of its parameter space. In this situation the usual asymptotic chi-square distribution of the likelihood ratio statistic does not necessarily hold. To circumvent the derivation of the null distribution we resort to the bootstrap method because it is generally applicable and easy to implement. Both parametric and nonparametric bootstrap likelihood ratio tests are studied. Numerical studies are implemented to evaluate the performance of the proposed bootstrap likelihood ratio test and to compare it with some existing methods for the identification of rare variants. To reduce the computational time of the bootstrap likelihood ratio test we propose an effective approximation mixture for the bootstrap null distribution. The GAW17 data is used to illustrate the proposed test.
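To make the boundary issue concrete, the testing problem described in this abstract can be written out as follows. The notation (y, X, G, u, τ, σ², T, B) is assumed here for illustration and is not taken from the paper, and the add-one form of the bootstrap p-value is a common convention that may differ from the authors' choice.

```latex
\begin{aligned}
y &= X\beta + G u + \varepsilon, \qquad u \sim N(0, \tau I), \quad \varepsilon \sim N(0, \sigma^2 I),\\
H_0 &: \tau = 0 \quad \text{versus} \quad H_1 : \tau > 0,\\
T &= 2\Bigl\{ \sup_{\beta,\,\sigma^2,\,\tau \ge 0} \ell(\beta, \sigma^2, \tau) - \sup_{\beta,\,\sigma^2} \ell(\beta, \sigma^2, 0) \Bigr\},\\
\hat{p} &= \frac{1 + \#\{\, b : T^{*}_{b} \ge T \,\}}{B + 1}.
\end{aligned}
```

Because τ = 0 lies on the edge of the admissible range τ ≥ 0, T need not follow the usual chi-square limit under the null. Each bootstrap replicate T*_b is the same statistic recomputed on a phenotype vector generated (parametric) or resampled (nonparametric) under the fitted null model, so the p-value is read off the empirical distribution of the B replicates instead of a theoretical reference distribution.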


“…For example, it is well known that single nucleotide polymorphisms (SNPs) can be divided into groups in terms of functional annotations or genes, and genes in turn can be grouped into pathways due to the shared biological function. It has been shown that incorporating such useful group/functional information into model fitting can substantially boost statistical power in genetic association studies and can facilitate our understanding of the genetic architecture of disease variation by heritability partition [25][26][27][28][29][30][31][32][33]. One widely-used group source is the pathway information in the Kyoto Encyclopedia of Genes and Genomes (KEGG) [34,35], which integrates information on genomic, chemical and system functions and groups genes with highly related sequences by analyzing the sequence similarity of genes.…”

Section: Introduction (mentioning)

confidence: 99%

Jackknife model averaging prediction methods for complex phenotypes with gene expression levels by integrating external pathway information

Xiao, Zeng et al. 2018

Preprint

Self Cite


Motivation: In the past few years many novel prediction approaches have been proposed and widely employed in high dimensional genetic data for disease risk evaluation. However, those approaches typically ignore in model fitting the important group structures or functional classifications that naturally exist in genetic data. Methods: In the present study, we applied a novel model averaging approach, called Jackknife Model Averaging Prediction (JMAP), for high dimensional genetic risk prediction while incorporating KEGG pathway information into the model specification. JMAP selects the optimal weights across candidate models by minimizing a cross-validation criterion in a jackknife way. Compared with previous approaches, one of the primary features of JMAP is to allow model weights to vary from 0 to 1 but without the limitation that the summation of weights is equal to one. We evaluated the performance of JMAP using extensive simulation studies and compared it with existing methods. We finally applied JMAP to five real cancer datasets that are publicly available from TCGA. Results: The simulations showed that, compared with other existing approaches, JMAP performed best or was among the best methods across a range of scenarios. For example, among 14 out of 16 simulation settings with PVE=0.3, JMAP has an average of 0.075 higher prediction accuracy compared with gsslasso. We further found that in the simulation the model weights for the true candidate models have much smaller chances to be zero compared with those for the null candidate models and are substantially greater in magnitude. In the real data application, JMAP also behaves comparably or better compared with the other methods for both continuous and binary phenotypes. For example, for the COAD, CRC and PAAD data sets, the average gains of predictive accuracy of JMAP are 0.019, 0.064 and 0.052 compared with gsslasso.
Conclusion: The proposed method JMAP is a novel method that can provide more accurate phenotypic prediction while incorporating external useful group information.
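The weight-selection step described in this abstract (weights between 0 and 1, no sum-to-one constraint, chosen by a jackknife criterion) can be illustrated with a small sketch. It is not the authors' implementation: the candidate models here are assumed to be simple one-predictor regressions, the criterion is assumed to be the squared error of the weighted leave-one-out predictions, and the optimiser is a plain coordinate descent with clipping. The names `loo_simple_ols` and `jackknife_weights` are hypothetical.

```python
def loo_simple_ols(x, y):
    """Leave-one-out predictions from y ~ a + b*x, refitting without sample i."""
    preds = []
    for i in range(len(y)):
        xs = x[:i] + x[i + 1:]
        ys = y[:i] + y[i + 1:]
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        sxx = sum((v - mx) ** 2 for v in xs)
        slope = sum((v - mx) * (u - my) for v, u in zip(xs, ys)) / sxx if sxx > 0 else 0.0
        preds.append(my + slope * (x[i] - mx))
    return preds


def jackknife_weights(loo_preds, y, n_iter=200):
    """Minimise sum_i (y_i - sum_k w_k * loo_preds[i][k])^2 subject to 0 <= w_k <= 1.

    loo_preds[i][k] is the leave-one-out prediction of y[i] from candidate model k.
    The weights are box-constrained only; they are not forced to sum to one.
    """
    n_models = len(loo_preds[0])
    w = [0.0] * n_models
    resid = list(y)  # residual y - P w, starting from w = 0
    for _ in range(n_iter):
        for j in range(n_models):
            col = [row[j] for row in loo_preds]
            denom = sum(c * c for c in col)
            if denom == 0.0:
                continue
            # Unconstrained coordinate update, then clip to the box [0, 1].
            new_w = w[j] + sum(c * r for c, r in zip(col, resid)) / denom
            new_w = min(1.0, max(0.0, new_w))
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[j] = new_w
    return w
```

In use, each candidate model (for example one per pathway) would contribute one column of leave-one-out predictions, and the returned weights would be applied to the candidates' predictions on new samples. Dropping the sum-to-one constraint means two informative candidates can both receive weights near one instead of competing for a shared budget.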

