Significance testing one SNP at a time has proven useful for identifying genomic regions that harbor variants affecting human disease. But after an initial genome scan has identified a “hit region” of association, single-locus approaches can falter. Local linkage disequilibrium (LD) can make both the number of underlying true signals and their identities ambiguous. Simultaneous modeling of multiple loci should help. However, it is typically applied ad hoc: conditioning on the top SNPs, with limited exploration of the model space and no assessment of how sensitive model choice is to sampling variability. Formal alternatives exist but are seldom used. Bayesian variable selection is coherent but requires specifying a full joint model, including priors on parameters and the model space. Penalized regression methods (e.g., LASSO) appear promising but require calibration, and, once calibrated, lead to a choice of SNPs that can be misleadingly decisive. We present a general method for characterizing uncertainty in model choice that is tailored to reprioritizing SNPs within a hit region under strong LD. Our method, LASSO local automatic regularization resample model averaging (LLARRMA), combines LASSO shrinkage with resample model averaging and multiple imputation, estimating for each SNP the probability that it would be included in a multi-SNP model in alternative realizations of the data. We apply LLARRMA to simulations based on case-control genome-wide association study data, and find that when there are several causal loci and strong LD, LLARRMA identifies a set of candidates that is enriched for true signals relative to single-locus analysis and to the recently proposed method of Stability Selection. Genet. Epidemiol. 36:451–462, 2012. © 2012 Wiley Periodicals, Inc.
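The core idea of resample model averaging, refitting a LASSO on many resamples and reporting how often each SNP enters the model, can be illustrated with a minimal sketch. This is not the LLARRMA implementation. It assumes a quantitative trait with squared-error loss instead of a case-control outcome, a fixed penalty `lam` instead of local automatic regularization, subsampling without replacement, and no multiple imputation. The function names (`lasso_cd`, `inclusion_probabilities`) and the simulated genotypes are hypothetical.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.
    Columns and y are centered here, so no intercept is needed."""
    n = len(y)
    cols = [center(c) for c in columns]
    resid = center(y)
    beta = [0.0] * len(cols)
    scale = [sum(x * x for x in c) / n for c in cols]
    for _ in range(n_iter):
        biggest = 0.0
        for j, col in enumerate(cols):
            if scale[j] == 0.0:  # column is constant in this resample
                continue
            # covariance of column j with the residual that excludes column j
            rho = sum(x * r for x, r in zip(col, resid)) / n + scale[j] * beta[j]
            new = soft_threshold(rho, lam) / scale[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, col)]
                beta[j] = new
                biggest = max(biggest, abs(delta))
        if biggest < tol:
            break
    return beta

def inclusion_probabilities(columns, y, lam, n_resamples=100, frac=0.5, seed=0):
    """Fraction of subsamples in which each column gets a nonzero coefficient."""
    rng = random.Random(seed)
    n = len(y)
    k = max(2, int(frac * n))
    counts = [0] * len(columns)
    for _ in range(n_resamples):
        idx = rng.sample(range(n), k)  # subsample without replacement (assumption)
        sub_cols = [[c[i] for i in idx] for c in columns]
        sub_y = [y[i] for i in idx]
        for j, b in enumerate(lasso_cd(sub_cols, sub_y, lam)):
            if b != 0.0:
                counts[j] += 1
    return [c / n_resamples for c in counts]

# Simulated example: x0 is causal, x1 is in strong LD with x0, the rest are null.
rng = random.Random(1)
n = 120
x0 = [rng.choice([0, 1, 2]) for _ in range(n)]
x1 = [g if rng.random() < 0.85 else rng.choice([0, 1, 2]) for g in x0]
others = [[rng.choice([0, 1, 2]) for _ in range(n)] for _ in range(3)]
y = [2.0 * g + rng.gauss(0.0, 1.0) for g in x0]
probs = inclusion_probabilities([x0, x1] + others, y, lam=0.3, n_resamples=50)
print([round(p, 2) for p in probs])
```

Reporting a frequency per SNP, instead of a single selected set, is what lets this kind of procedure express uncertainty in model choice: a SNP that enters in most resamples is a stronger candidate than one that enters only occasionally.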
Summary: We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, including that of generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on the following: cross-validation (CV), the Bayesian information criterion (BIC), Scaled Sparse Linear Regression, and a selection method based on recently developed testing procedures for the LASSO.
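A permutation-based choice of the LASSO penalty can be sketched as follows. The abstract does not spell out the algorithm, so this sketch rests on stated assumptions: squared-error loss, centered predictors and response, and a penalty set to the median, over random permutations of the response, of the smallest penalty at which the LASSO selects no variables. For the objective (1/2n)||y - Xb||^2 + lam * ||b||_1 that smallest penalty is max_j |x_j'y| / n, so no LASSO fit is needed. The function names (`lambda_max`, `permutation_lambda`) and the simulated data are hypothetical.

```python
import random
import statistics

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def lambda_max(columns, y):
    """Smallest penalty giving an all-zero LASSO solution for
    (1/2n)||y - Xb||^2 + lam * ||b||_1, with centered columns and y."""
    n = len(y)
    return max(abs(sum(x * r for x, r in zip(col, y))) / n for col in columns)

def permutation_lambda(columns, y, n_perm=200, seed=0):
    """Median over permutations of y of the no-selection penalty (assumed summary)."""
    rng = random.Random(seed)
    cols = [center(c) for c in columns]
    yc = center(y)
    draws = []
    for _ in range(n_perm):
        perm = yc[:]
        rng.shuffle(perm)  # breaks any real association between X and y
        draws.append(lambda_max(cols, perm))
    return statistics.median(draws)

# Simulated example: only the first of five predictors affects the response.
rng = random.Random(2)
n = 100
cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(5)]
y = [3.0 * a + rng.gauss(0.0, 1.0) for a in cols[0]]
lam_perm = permutation_lambda(cols, y)
lam_obs = lambda_max([center(c) for c in cols], center(y))
print(lam_perm, lam_obs)
```

When `lam_perm` is smaller than `lam_obs`, fitting the LASSO at `lam_perm` admits at least one predictor, because the observed association exceeds what is typically seen once the link between predictors and response has been destroyed by permutation.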
Lipoprotein (a) (Lp(a)) is an independent risk factor for cardiovascular disease. Lp(a) levels in African Americans (AAs) are much higher than those in European Americans. We conducted a genome-wide and exome-wide association study of Lp(a) among 2895 AAs participating in the Jackson Heart Study. We observed that local ancestry at 6q25.3 was an important risk factor for Lp(a) in AAs, and that multiple single-nucleotide polymorphisms (SNPs) at the well-established LPA locus were significantly associated with Lp(a) (P < 5 × 10⁻⁸) after adjusting for the local ancestry at 6q25.3. Interestingly, before adjusting for local ancestry, we observed significant (P < 5 × 10⁻⁸) associations for hundreds of SNPs spanning a ~10 Mb region on 6q surrounding the LPA gene, whereas after adjusting for local ancestry, the region containing significantly associated SNPs became much narrower (<1 Mb) and was centered over the LPA gene. We observed a single nonsynonymous SNP in APOE significantly associated with Lp(a) (P < 5 × 10⁻⁸). A high burden of coding variants in LPA and APOE was also associated with higher Lp(a) levels. Our study provides evidence that ancestry-specific causal risk variant(s) reside in or near LPA and that most of the observed associations outside this narrower region are spurious.