The excision of introns from pre-mRNA is an essential step in mRNA processing. We developed LeafCutter to study sample and population variation in intron splicing. LeafCutter identifies variable splicing events from short-read RNA-seq data and finds events of high complexity. Our approach obviates the need for transcript annotations and circumvents the challenges in estimating relative isoform or exon usage in complex splicing events. LeafCutter can be used both for detecting differential splicing between sample groups, and for mapping splicing quantitative trait loci (sQTLs). Compared to contemporary methods, we find 1.4–2.1 times more sQTLs, many of which help us ascribe molecular effects to disease-associated variants. Strikingly, transcriptome-wide associations between LeafCutter intron quantifications and 40 complex traits increased the number of associated disease genes at 5% FDR by an average of 2.1-fold as compared to using gene expression levels alone. LeafCutter is fast, scalable, easy to use, and available online.
Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.
To understand the biological mechanisms underlying the thousands of genetic variants robustly associated with complex traits, scalable methods that integrate GWAS and functional data generated by large-scale efforts are needed. We derived a mathematical expression to compute PrediXcan results using summary data (S-PrediXcan) and showed its accuracy and robustness to misspecified reference populations. We compared S-PrediXcan with existing methods and combined them into a best practice framework (MetaXcan) that integrates GWAS with QTL studies and reduces LD-confounded associations. We applied this framework to 44 GTEx tissues and 101 phenotypes from GWAS and meta-analysis studies, creating a growing catalog of associations that captures the effects of gene expression variation on human phenotypes. Most of the associations were tissue specific, indicating context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the advantages of an agnostic scanning of multiple contexts to increase the probability of detecting causal regulatory mechanisms. Prediction models, efficient software implementation, and association results are shared as a resource for the research community.
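The abstract above does not state the summary-data expression itself. As a rough illustration only, the sketch below follows the commonly cited form of the S-PrediXcan approximation, in which the gene-level z-score is a weighted sum of per-SNP GWAS z-scores, with each term scaled by the ratio of the SNP's standard deviation to the standard deviation of predicted expression. The function name, argument names, and toy inputs are hypothetical and are not taken from the MetaXcan software.

```python
import math


def s_predixcan_z(weights, gwas_z, snp_cov):
    """Approximate gene-level z-score from summary statistics (illustrative sketch).

    weights : prediction-model weights w_l for the SNPs in one gene's model
    gwas_z  : GWAS z-scores (beta_l / se(beta_l)) for the same SNPs, same order
    snp_cov : reference-panel covariance matrix of those SNPs (list of lists)
    """
    n = len(weights)
    if len(gwas_z) != n or len(snp_cov) != n:
        raise ValueError("weights, gwas_z and snp_cov must describe the same SNPs")

    # Variance of predicted expression: w' * Cov * w
    var_g = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_g <= 0:
        raise ValueError("predicted expression has non-positive variance")

    # Weighted sum of SNP z-scores, each scaled by that SNP's standard deviation
    numerator = sum(
        weights[i] * math.sqrt(snp_cov[i][i]) * gwas_z[i] for i in range(n)
    )
    return numerator / math.sqrt(var_g)


# Toy usage with two uncorrelated, unit-variance SNPs (made-up numbers)
z_gene = s_predixcan_z([1.0, 1.0], [2.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
```

With a single-SNP model the gene-level score reduces to the SNP's own z-score, with its sign set by the weight, which is a convenient sanity check on the scaling.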
Both postprandial hyperglycemia and insulin resistance (IR) have implications for the development of cardiovascular disease. The present study was designed to examine differences in postprandial glycemia and insulin sensitivity among young adults of different ethnic origins. Lean, healthy subjects (n = 60) from five ethnic groups [20 European Caucasians, 10 Chinese, 10 South East (SE) Asians, 10 Asian Indians and 10 Arabic Caucasians] were matched for age, body mass index, waist circumference, birth weight and current diet. A 75-g white bread carbohydrate challenge was fed to assess postprandial glycemia and insulinemia. Insulin sensitivity was assessed in three groups by the euglycemic-hyperinsulinemic clamp and in all subjects by homeostasis model assessment (HOMA) modeling. Postprandial hyperglycemia (incremental area under the curve) and insulin sensitivity (M-value) both showed a twofold variation among the groups (P < 0.001) and were significantly related to each other (R² = 56%, P < 0.001). Young SE Asians had the highest postprandial glycemia and lowest insulin sensitivity, whereas European and Arabic Caucasian subjects were the most insulin sensitive and carbohydrate tolerant. These findings suggest that IR is evident even in lean, young adults of some ethnic groups and is associated with significant increases in postprandial glycemia and insulinemia in response to a realistic carbohydrate load.
For many complex traits, gene regulation is likely to play a crucial mechanistic role. How the genetic architectures of complex traits vary between populations and subsequent effects on genetic prediction are not well understood, in part due to the historical paucity of GWAS in populations of non-European ancestry. We used data from the MESA (Multi-Ethnic Study of Atherosclerosis) cohort to characterize the genetic architecture of gene expression within and between diverse populations. Genotype and monocyte gene expression were available in individuals with African American (AFA, n = 233), Hispanic (HIS, n = 352), and European (CAU, n = 578) ancestry. We performed expression quantitative trait loci (eQTL) mapping in each population and show genetic correlation of gene expression depends on shared ancestry proportions. Using elastic net modeling with cross validation to optimize genotypic predictors of gene expression in each population, we show the genetic architecture of gene expression for most predictable genes is sparse. We found the best predicted gene in each population, TACSTD2 in AFA and CHURC1 in CAU and HIS, had similar prediction performance across populations with R² > 0.8 in each population. However, we identified a subset of genes that are well-predicted in one population, but poorly predicted in another. We show these differences in predictive performance are due to allele frequency differences between populations. Using genotype weights trained in MESA to predict gene expression in independent populations showed that a training set with ancestry similar to the test set is better at predicting gene expression in test populations, demonstrating an urgent need for diverse population sampling in genomics. Our predictive models and performance statistics in diverse cohorts are made publicly available for use in transcriptome mapping methods at https://github.com/WheelerLab/DivPop.