We use a genome-wide association of 1 million parental lifespans of genotyped subjects and data on mortality risk factors to validate previously unreplicated findings near CDKN2B-AS1, ATXN2/BRAP, FURIN/FES, ZW10, PSORS1C3, and 13q21.31, and identify and replicate novel findings near ABO, ZC3HC1, and IGF2R. We also validate previous findings near 5q33.3/EBF1 and FOXO3, whilst finding contradictory evidence at other loci. Gene set and cell-specific analyses show that expression in foetal brain cells and adult dorsolateral prefrontal cortex is enriched for lifespan variation, as are gene pathways involving lipid proteins and homeostasis, vesicle-mediated transport, and synaptic function. Individual genetic variants that increase dementia, cardiovascular disease, and lung cancer – but not other cancers – explain the most variance. Resulting polygenic scores show a mean lifespan difference of around five years of life across the deciles.
Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
A general objective of genetic studies is to understand the genetic basis of complex traits such as height, body mass index (BMI), and disease endpoints. Such research has been facilitated by the completion of the Human Genome Project and the development of high-throughput technologies. With high-throughput genotyping and sequencing technologies, information on millions of genetic markers can be measured for each individual. The most widely used strategy to detect associations between genetic variants and a complex trait is the genome-wide association study (GWAS). Because the genetic architecture of most complex traits is highly polygenic, the signal-to-noise ratio is usually very low. Thus, especially in human populations, GWAS often requires large samples to obtain sufficient power. Unfortunately, given the restrictions on sharing individual-level data, it is often not feasible to pool data from different cohorts. Nevertheless, each cohort can report and share GWAS summary statistics, such as sample sizes, allele frequencies, estimates of genetic effect sizes, and their standard errors for the genetic markers across the genome. Therefore, one recent focus in statistical methodology development for genetic studies has been on meta-analysis techniques using summary-level data. The objective of this thesis is to develop novel statistical genetics methods based on GWAS summary statistics and to apply these methods to better understand the genetic architecture underlying complex traits. In Study I, we developed a Selection Operator for JOint analyzing multiple SNPs (SOJO). We mathematically proved and empirically showed that the least absolute shrinkage and selection operator (LASSO) could be achieved using GWAS summary-level data. Compared to stepwise selection procedures, SOJO performs better in variable selection.
SOJO is useful for detecting additional variants with independent effects and assessing the magnitude of allelic heterogeneity within loci. In Study II, we developed a High-Definition Likelihood (HDL) method to improve the accuracy of genetic correlation estimation using GWAS summary statistics. Compared to the state-of-the-art method LD Score regression (LDSC), HDL achieves higher statistical power to detect genetic correlations between phenotypes by fully accounting for linkage disequilibrium (LD) information across the genome. In Study III, we introduced a four-level strategy for replication of loci detected by multi-trait GWAS methods. The four methods provide different degrees of replication strength, useful for providing additional evidence when a locus has been discovered and replicated by multivariate analysis of variance (MANOVA) or other multi-trait methods. The replication methods only require summary association statistics and are straightforward to apply to multi-trait GWAS analyses. In Study IV, using GWAS summary statistics, we developed a method named Genetic Correlation Contrast for Causality (G3C) as a more robust test for the existence and di...
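The claim in Study I that LASSO can be fitted from summary-level data rests on the fact that the LASSO objective depends on the data only through X'X and X'y. The following is a minimal sketch of that idea, not the SOJO implementation: it assumes standardized genotypes and phenotype, so that `R` plays the role of the LD correlation matrix (X'X/n) and `r` the vector of marginal standardized effects (X'y/n). The function name `lasso_from_summary`, the argument names, and the use of plain coordinate descent as the solver are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator used in LASSO coordinate updates."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def lasso_from_summary(R, r, lam, n_iter=1000, tol=1e-10):
    """Minimize 0.5 * b'Rb - r'b + lam * ||b||_1 by coordinate descent.

    R   : (p, p) LD correlation matrix, assumed to equal X'X / n
    r   : (p,) marginal effects, assumed to equal X'y / n
    lam : non-negative penalty weight
    No individual-level genotypes or phenotypes are needed.
    """
    R = np.asarray(R, dtype=float)
    r = np.asarray(r, dtype=float)
    beta = np.zeros(len(r))
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(len(r)):
            # Marginal effect of SNP j with the fitted effects of all
            # other SNPs removed through their LD with SNP j.
            rho = r[j] - R[j] @ beta + R[j, j] * beta[j]
            new_bj = soft_threshold(rho, lam) / R[j, j]
            max_change = max(max_change, abs(new_bj - beta[j]))
            beta[j] = new_bj
        if max_change < tol:
            break
    return beta


# Hypothetical two-SNP locus in moderate LD (values are made up).
R_example = np.array([[1.0, 0.5], [0.5, 1.0]])
r_example = np.array([0.3, 0.2])
print(lasso_from_summary(R_example, r_example, lam=0.05))
```

With `lam=0` and an invertible `R`, the result reduces to the ordinary joint estimate obtained by solving `R b = r`; increasing `lam` shrinks effects and sets weaker ones exactly to zero, which is what makes the approach usable for variable selection within a locus.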
Joint modeling of a number of phenotypes using multivariate methods has often been neglected in genome-wide association studies and, when used, replication has not been sought. Modern omics technologies allow characterization of functional phenomena using a large number of related phenotype measures, which can benefit from such joint analysis. Here, we report a multivariate genome-wide association study of 23 immunoglobulin G (IgG) N-glycosylation phenotypes. In the discovery cohort, our multi-phenotype method uncovers ten genome-wide significant loci, of which five are novel (IGH, ELL2, HLA-B-C, AZI1, FUT6-FUT3). We convincingly replicate all novel loci via multivariate tests. We show that IgG N-glycosylation loci are strongly enriched for genes expressed in the immune system, in particular antibody-producing cells and B lymphocytes. We empirically demonstrate the efficacy of multivariate methods to discover novel, reproducible pleiotropic effects.
The E. coli protein, Ffh, binds to 4.5S RNA through its M domain to form the signal recognition particle (SRP). The other domain of Ffh (NG) is a GTPase, which binds and is coordinately regulated by its receptor, FtsY. We find that the helical M domain is inherently flexible. Binding of 4.5S RNA to Ffh stabilizes the M domain yet has little apparent effect on the binding of signal peptides. However, in the absence of the RNA, signal peptide binding results in a global destabilization of Ffh, which is prevented by binding of 4.5S RNA. Signal peptide binding to isolated NG domain also causes a pronounced destabilization, implicating the NG domain in direct recognition of signal peptide.