Protein-coding repeat polymorphisms strongly shape diverse human phenotypes

Mukamel, Ronen E.; Handsaker, Robert E.; Sherman, Maxwell A.; Barton, Alison R.; Zheng, Yiming; McCarroll, Steven A.; Loh, Po-Ru

doi:10.1126/science.abg8289

Cited by 127 publications

(204 citation statements)

References 100 publications

Supporting

Mentioning: 194

Contrasting


“…These limitations are not specific to meta-analysis fine-mapping, and separate fine-mapping methods based on bespoke imputation references have been developed (e.g., HLA 81, KIR 82, and variable numbers of tandem repeats [VNTR] 83).…”

Section: Discussion (mentioning)

confidence: 99%

“…On the other hand, we find it challenging to use a LD reference when true causal variants are located within a complex region (e.g., major histocompatibility complex [MHC]), or are entirely missing from standard LD or imputation reference panels, especially for structural variants. These limitations are not specific to meta-analysis fine-mapping, and separate fine-mapping methods based on bespoke imputation references have been developed (e.g., HLA 81, KIR 82, and variable numbers of tandem repeats [VNTR] 83).…”

Section: Discussion (mentioning)

confidence: 99%


Meta-analysis fine-mapping is often miscalibrated at single-variant resolution

Kanai, Elzur, Zhou, et al. 2022. Preprint.


Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWAS) into a more powerful whole. To resolve causal variants, meta-analysis studies typically apply summary statistics-based fine-mapping methods as they are applied to single-cohort studies. However, it is unclear whether heterogeneous characteristics of each cohort (e.g., ancestry, sample size, phenotyping, genotyping, or imputation) affect fine-mapping calibration and recall. Here, we first demonstrate that meta-analysis fine-mapping is substantially miscalibrated in simulations when different genotyping arrays or imputation panels are included. To mitigate these issues, we propose a summary statistics-based QC method, SLALOM, that identifies suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics based on ancestry-matched local LD structure. Having validated SLALOM performance in simulations and the GWAS Catalog, we applied it to 14 disease endpoints from the Global Biobank Meta-analysis Initiative and found that 68% of loci showed suspicious patterns that call into question fine-mapping accuracy. These predicted suspicious loci were significantly depleted for having likely causal variants, such as nonsynonymous variants, as a lead variant (2.8x; Fisher's exact P = 6.2 × 10⁻⁴). Compared to fine-mapping results in individual biobanks, we found limited evidence of fine-mapping improvement in the GBMI meta-analyses. Although a full solution requires complete synchronization across cohorts, our approach identifies likely spurious results in meta-analysis fine-mapping. We urge extreme caution when interpreting fine-mapping results from meta-analysis.


“…Recent studies have suggested that variable number tandem repeats have differing effects on differing haplotypic backgrounds, although this has not been shown for HMOX1. 42 Thirdly, although we tested our imputation approach extensively, and are confident of the accuracy to within 2 repeats, it may be that the imputation is biased with respect to important outcomes. This may be particularly relevant if for example, a set of SNPs that are causal for the outcomes and are usually coincident with an increased repeat length are systematically imputed incorrectly.…”

Section: Comparison With Previous Literature (mentioning)

confidence: 99%

Phenotypic associations with the HMOX1 GT(n) promoter repeat in European populations

Hamilton, Mitchell, Ghazal, et al. 2022. Preprint.


HO-1 is a key enzyme in the management of heme in humans. A GT(n) repeat length in the gene HMOX1 has previously been widely associated with a variety of phenotypes, including susceptibility and outcomes in diabetes, cancer, infections, and neonatal jaundice. However, studies are generally small and results inconsistent. In this study, we imputed the GT(n) repeat length in two European cohorts (UK Biobank, n = 463,005; and the Avon Longitudinal Study of Parents and Children [ALSPAC], n = 937), with the reliability of imputation tested in other cohorts (1000 Genomes, HGDP, and UK-PGP). Subsequently, we measured the relationship between repeat length and previously identified associations (diabetes, COPD, pneumonia and infection-related mortality in UK Biobank; neonatal jaundice in ALSPAC) and performed a phenome-wide association study (PheWAS) in UK Biobank. Despite high-quality imputation (correlation between true repeat length and imputed repeat length >0.9 in test cohorts), no clinical associations were identified in either the PheWAS or specific association studies. These findings were robust to definitions of repeat length and sensitivity analyses. Despite multiple smaller studies identifying associations across a variety of clinical settings, we could not replicate or identify any relevant phenotypic associations with the HMOX1 GT(n) repeat.


“…Emerging technologies have recently revealed hundreds of thousands of genomic structural variants (SVs), including polymorphic duplications, deletions, inversions, and mobile transposable elements in the human genome (Hurles et al 2008; Conrad et al 2010; Pang et al 2010; Mukamel et al 2021). Unlike single-nucleotide variants, each SV affects a continuous block in the genome and thus is more likely to result in a phenotypic effect (Hurles et al 2008; Weischenfeldt et al 2013; Sudmant, Rausch, et al 2015).…”

Section: Introduction (mentioning)

confidence: 99%

“…Unlike single-nucleotide variants, each SV affects a continuous block in the genome and thus is more likely to result in a phenotypic effect (Hurles et al 2008; Weischenfeldt et al 2013; Sudmant, Rausch, et al 2015). Several SVs have been documented to have considerable effects on human disease and evolution (Dennis and Eichler 2016; Payer et al 2017; Hsieh et al 2019; Ho et al 2020; Mukamel et al 2021). Some of these functional variants reach >20% allele frequency in human populations, and some affect the copy number variation (CNV) of entire protein-coding genes (McCarroll et al 2005; Handsaker et al 2015).…”

Section: Introduction (mentioning)

confidence: 99%

Similarity-Based Analysis of Allele Frequency Distribution among Multiple Populations Identifies Adaptive Genomic Structural Variants

Saitou, Masuda, Gökçümen 2021. Molecular Biology and Evolution.


Structural variants have a considerable impact on human genomic diversity. However, their evolutionary history remains mostly unexplored. Here, we developed a new method to identify potentially adaptive structural variants based on a similarity-based analysis that incorporates genotype frequency data from 26 populations simultaneously. Using this method, we analyzed 57,629 structural variants and identified 576 structural variants that show unusual population differentiation. Of these putatively adaptive structural variants, we further showed that 24 variants are multiallelic and overlap with coding sequences, and 20 variants are significantly associated with GWAS traits. Closer inspection of the haplotypic variation associated with these putatively adaptive and functional structural variants reveals deviations from neutral expectations due to (i) population differentiation of rapidly evolving multi-allelic variants, (ii) incomplete sweeps, and (iii) recent population-specific negative selection. Overall, our study provides new methodological insights, documents hundreds of putatively adaptive variants, and introduces evolutionary models that may better explain the complex evolution of structural variants.
