Hypothesis: The prevalence of type 2 diabetes is higher in Latino populations than in other major ancestry groups. Not only has the Latino population been systematically underrepresented in large-scale genetic analyses, but previous studies also relied on imputing ungenotyped variants from the 1000 Genomes (1000G) reference panel, which captures low-frequency and Latino-enriched variants suboptimally. The NHLBI Trans-Omics for Precision Medicine (TOPMed) reference panel represents a unique opportunity to analyze rare genetic variation in the Latino population.
Methods: We evaluated TOPMed imputation performance using genotyping array and whole-exome sequence data in 6 Latino cohorts. To assess whether TOPMed imputation increases the number of identified loci, we performed a Latino type 2 diabetes GWAS meta-analysis in 8,150 type 2 diabetes cases and 10,735 controls and replicated the results in 6 additional cohorts, including whole-genome sequence data from the All of Us cohort.
Results: We show that, compared to imputation with 1000G, the TOPMed panel improves the identification of rare and low-frequency variants. We identified 26 distinct signals including a novel genome-wide significant variant (minor allele frequency 1.6%, OR=2.0, P=3.4×10⁻⁹) near ORC5. A Latino-tailored polygenic score constructed from our data and GWAS data from East Asian and European populations improves the prediction accuracy in a Latino target dataset, explaining up to 7.6% of the type 2 diabetes risk variance.
Conclusions: Our results demonstrate the utility of TOPMed imputation for identifying low-frequency variation in understudied populations, leading to the discovery of novel disease associations and the improvement of polygenic scores.
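
As a rough illustration of the polygenic score concept referred to above, the sketch below computes a score as a weighted sum of allele dosages, the general form such scores take. It is not the authors' pipeline: the function name, the toy dosages, and the weights are all hypothetical, and the only value borrowed from the abstract is OR=2.0, used here purely to show how an odds ratio maps to a log-odds weight.

```python
import math
from typing import List, Sequence


def polygenic_score(dosages: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Return one score per individual: sum over variants of dosage * weight.

    dosages: one row per individual, one column per variant (0 to 2 copies
             of the effect allele; imputed dosages may be fractional).
    weights: per-variant effect sizes on the log-odds scale.
    """
    scores = []
    for person in dosages:
        if len(person) != len(weights):
            raise ValueError("each dosage row needs one value per weight")
        scores.append(sum(d * w for d, w in zip(person, weights)))
    return scores


# Hypothetical toy data: 3 individuals, 3 variants.
toy_dosages = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
]
# Hypothetical weights; the third is log(2.0), i.e. the log-odds weight
# corresponding to an odds ratio of 2.0 (illustrative only).
toy_weights = [0.10, -0.05, math.log(2.0)]

toy_scores = polygenic_score(toy_dosages, toy_weights)
```

In practice the weights would come from GWAS summary statistics, possibly combined across ancestry groups, and the score would be evaluated in an independent target dataset.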