Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes. To characterise the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study (GWAS) data from 2,535,601 individuals (39.7% non-European ancestry), including 428,452 T2D cases. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are previously unreported. We define eight non-overlapping clusters of T2D signals characterised by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial, and enteroendocrine cells. We build cluster-specific partitioned genetic risk scores (GRS) in an additional 137,559 individuals of diverse ancestry, including 10,159 T2D cases, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned GRS are more strongly associated with coronary artery disease and end-stage diabetic nephropathy than an overall T2D GRS across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings demonstrate the value of integrating multi-ancestry GWAS with single-cell epigenomics to disentangle the aetiological heterogeneity driving the development and progression of T2D, which may offer a route to optimise global access to genetically-informed diabetes care.
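As a rough illustration of what a cluster-specific partitioned GRS can look like, the sketch below sums effect-size-weighted risk-allele dosages separately for each cluster of variants. It is a minimal sketch, not the study's pipeline: the function name `partitioned_grs`, the variant IDs, the weights, and the cluster labels are all hypothetical, and the abstract does not state how the weights were derived or how the scores were standardised.

```python
from typing import Dict, List


def partitioned_grs(
    dosages: Dict[str, float],
    weights: Dict[str, float],
    clusters: Dict[str, List[str]],
) -> Dict[str, float]:
    """Return one weighted risk score per cluster for a single individual.

    dosages  : risk-allele dosage (0 to 2) per variant ID
    weights  : per-variant effect size (assumed here to be a log odds ratio)
    clusters : non-overlapping lists of variant IDs, keyed by cluster label
    """
    scores = {}
    for label, variants in clusters.items():
        # A missing genotype contributes zero; a real analysis might impute instead.
        scores[label] = sum(weights[v] * dosages.get(v, 0.0) for v in variants)
    return scores


# Toy data only: variant IDs, weights, and cluster labels are made up.
dosages = {"var_a": 2.0, "var_b": 1.0, "var_c": 0.0}
weights = {"var_a": 0.10, "var_b": 0.05, "var_c": 0.20}
clusters = {"cluster_1": ["var_a", "var_b"], "cluster_2": ["var_c"]}

scores = partitioned_grs(dosages, weights, clusters)
# With non-overlapping clusters and shared weights, the partitioned scores
# add up to an overall score over the same variants.
overall = sum(scores.values())
print(scores, overall)
```

Under these assumptions, each cluster score can then be tested against an outcome alongside the overall score, which is the kind of comparison the abstract reports.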
Genetic association studies have identified hundreds of independent signals associated with type 2 diabetes (T2D) and related traits. Despite these successes, the identification of specific causal variants underlying a genetic association signal remains challenging. In this study, we describe a deep learning method to analyze the impact of sequence variants on enhancers. Focusing on pancreatic islets, a T2D-relevant tissue, we show that our model learns islet-specific transcription factor (TF) regulatory patterns and can be used to prioritize candidate causal variants. At 101 genetic signals associated with T2D and related glycemic traits where multiple variants occur in linkage disequilibrium, our method nominates a single causal variant for each association signal, including three variants previously shown to alter reporter activity in islet-relevant cell types. For another signal associated with blood glucose levels, we biochemically test all candidate causal variants from statistical fine-mapping using a pancreatic islet beta cell line and show biochemical evidence of allelic effects on TF binding for the model-prioritized variant. To aid in future research, we publicly distribute our model and islet enhancer perturbation scores across ∼67 million genetic variants. We anticipate that deep learning methods like the one presented in this study will enhance the prioritization of candidate causal variants for functional studies.
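The general idea of scoring a variant by its predicted effect on a sequence model can be sketched as below. This is a generic in-silico mutagenesis sketch under stated assumptions, not the published method: the abstract does not define how its perturbation scores are computed, so `perturbation_score`, `toy_model`, the motif, and the candidate variants are all hypothetical stand-ins, with a simple motif counter in place of a trained network.

```python
from typing import Callable, Dict, Tuple


def perturbation_score(
    model: Callable[[str], float], ref_seq: str, pos: int, alt: str
) -> float:
    """Predicted activity of the alternate sequence minus the reference.

    pos is a 0-based index into ref_seq; alt is a single substituted base.
    """
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return model(alt_seq) - model(ref_seq)


def toy_model(seq: str) -> float:
    # Stand-in for a trained enhancer model: counts a made-up motif.
    return float(seq.count("TGAC"))


ref = "AATGACGG"
# Hypothetical candidate variants in linkage disequilibrium: (position, alt base).
candidates: Dict[str, Tuple[int, str]] = {"var_1": (3, "C"), "var_2": (7, "T")}

scores = {
    name: perturbation_score(toy_model, ref, pos, alt)
    for name, (pos, alt) in candidates.items()
}
# Nominate the variant with the largest absolute predicted effect.
top = max(scores, key=lambda name: abs(scores[name]))
print(scores, top)
```

In this toy case only `var_1` disrupts the motif, so it is the one nominated; with a real model the same ranking step would be applied to the variants at each association signal.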