Enhancer elements in the human genome control how genes are expressed in specific cell types and harbor thousands of genetic variants that influence risk for common diseases [1][2][3][4]. Yet, we still do not know how enhancers regulate specific genes, and we lack general rules to predict enhancer-…
Age is the dominant risk factor for most chronic human diseases; yet the mechanisms by which aging confers this risk are largely unknown [1]. Recently, the age-related acquisition of somatic mutations in regenerating hematopoietic stem cell populations leading to clonal expansion was associated with both hematologic cancer [2–4] and coronary heart disease [5], a phenomenon termed ‘Clonal Hematopoiesis of Indeterminate Potential’ (CHIP) [6]. Simultaneous germline and somatic whole genome sequence analysis now provides the opportunity to identify root causes of CHIP. Here, we analyze high-coverage whole genome sequences from 97,691 participants of diverse ancestries in the NHLBI TOPMed program and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid, and inflammatory traits specific to different CHIP genes. Association of a genome-wide set of germline genetic variants identified three genetic loci associated with CHIP status, including one locus at TET2 that was African ancestry specific. In silico-informed in vitro evaluation of the TET2 germline locus identified a causal variant that disrupts a TET2 distal enhancer resulting in increased hematopoietic stem cell self-renewal. Overall, we observe that germline genetic variation shapes hematopoietic stem cell function leading to CHIP through mechanisms that are both specific to clonal hematopoiesis and shared mechanisms leading to somatic mutations across tissues.
Genome-wide association studies (GWAS) are a valuable tool for understanding the biology of complex traits, but the associations found rarely point directly to causal genes. Here, we introduce a new method to identify the causal genes by integrating GWAS summary statistics with gene expression, biological pathway, and predicted protein-protein interaction data. We further propose an approach that effectively leverages both polygenic and locus-specific genetic signals by combining results across multiple gene prioritization methods, increasing confidence in prioritized genes. Using a large set of gold standard genes to evaluate our approach, we prioritize 8,402 unique gene-trait pairs with greater than 75% estimated precision across 113 complex traits and diseases, including known genes such as SORT1 for LDL cholesterol, SMIM1 for red blood cell count, and DRD2 for schizophrenia, as well as novel genes such as TTC39B for cholelithiasis. Our results demonstrate that a polygenic approach is a powerful tool for gene prioritization and, in combination with locus-specific signal, improves upon existing methods.