One of the most striking patterns of genome structure is the tight, typically negative, association between transposable elements (TEs) and meiotic recombination rates. While this is a highly recurring feature of eukaryotic genomes, the mechanisms driving correlations between TEs and recombination remain poorly understood, and distinguishing cause versus effect is challenging. Here, we review the evidence for a relation between TEs and recombination, and discuss the underlying evolutionary forces. Evidence to date suggests that overall TE densities correlate negatively with recombination, but the strength of this correlation varies across element types, and the pattern can be reversed. Results suggest that heterogeneity in the strength of selection against ectopic recombination and gene disruption can drive TE accumulation in regions of low recombination, but there is also strong evidence that the regulation of TEs can influence local recombination rates. We hypothesize that TE insertion polymorphism may be important in driving within-species variation in recombination rates in surrounding genomic regions. Furthermore, the interaction between TEs and recombination may create positive feedback, whereby TE accumulation in non-recombining regions contributes to the spread of recombination suppression. Further investigation of the coevolution between recombination and TEs has important implications for our understanding of the evolution of recombination rates and genome structure. This article is part of the themed issue 'Evolutionary causes and consequences of recombination rate variation in sexual organisms'.
Summary: Biobanks are being established across the world to understand the genetic, environmental, and epidemiological basis of human diseases with the goal of better prevention and treatments. Genome-wide association studies (GWAS) have been very successful at mapping genomic loci for a wide range of human diseases and traits, but in general they lack appropriate representation of diverse ancestries, with most biobanks and preceding GWAS composed of individuals of European ancestries. Here, we introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 19 biobanks from 4 continents representing more than 2.1 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWAS generated using harmonized genotypes and phenotypes from member biobanks. GBMI brings together results from GWAS analysis across 6 main ancestry groups: approximately 33,000 individuals of African ancestry, either from Africa or from the admixed-ancestry diaspora (AFR), 18,000 admixed American (AMR), 31,000 Central and South Asian (CSA), 341,000 East Asian (EAS), 1.4 million European (EUR), and 1,600 Middle Eastern (MID) individuals. In this flagship project, we generated GWASs across 14 exemplar diseases and endpoints, including both common and less prevalent diseases that were previously understudied. Using the genetic association results, we validate that GWASs conducted in biobanks worldwide can be successfully integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics between biobanks. We demonstrate the value of this collaborative effort to improve GWAS power for diseases, increase representation, benefit understudied diseases, and improve risk prediction, while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of the studied traits.
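As a rough illustration of what meta-analyzing GWAS summary statistics can involve, the sketch below combines per-biobank effect estimates for a single variant using fixed-effect inverse-variance weighting. This is one common approach, and the abstract does not state which method GBMI used; the function name and the toy numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(
    betas: Sequence[float], standard_errors: Sequence[float]
) -> Tuple[float, float]:
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    Each study contributes its effect estimate weighted by 1 / SE^2.
    Returns the pooled effect estimate and its standard error.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Hypothetical per-biobank estimates for the same variant (not real data).
study_betas = [0.2, 0.4]
study_ses = [0.1, 0.2]

pooled_beta, pooled_se = inverse_variance_meta(study_betas, study_ses)
print(f"pooled beta = {pooled_beta:.3f}, pooled SE = {pooled_se:.4f}")
```

The more precise study (smaller standard error) dominates the pooled estimate, which is why adding biobanks improves power even when individual cohorts are small.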
Summary: With the increasing availability of biobank-scale datasets that incorporate both genomic data and electronic health records, many associations between genetic variants and phenotypes of interest have been discovered. Polygenic risk scores (PRS), which are being widely explored in precision medicine, use the results of association studies to predict the genetic component of disease risk by accumulating risk alleles weighted by their effect sizes. However, few studies have thoroughly investigated best practices for PRS in global populations across different diseases. In this study, we utilize data from the Global-Biobank Meta-analysis Initiative (GBMI), which consists of individuals from diverse ancestries and across continents, to explore methodological considerations and PRS prediction performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRS using heuristic (pruning and thresholding, P+T) and Bayesian (PRS-CS) methods. We found that the genetic architecture, such as SNP-based heritability and polygenicity, varied greatly among endpoints. For both PRS construction methods, using a European-ancestry LD reference panel resulted in comparable or higher prediction accuracy compared to several other non-European-based panels; this is largely attributable to European-descent populations still comprising the majority of GBMI participants. PRS-CS overall outperformed the classic P+T method, especially for endpoints with higher SNP-based heritability. For example, substantial improvements are observed in East Asian ancestry (EAS) using PRS-CS compared to P+T for heart failure (HF) and chronic obstructive pulmonary disease (COPD). Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across global populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using the GBMI and highlight the importance of best practices for PRS in the biobank-scale genomics era.
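The description of a PRS as risk alleles accumulated and weighted by effect size can be made concrete with a minimal sketch. The code below computes the weighted sum for one individual and applies only the p-value thresholding half of P+T; LD pruning is omitted, and PRS-CS is not shown. All function names and toy values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def polygenic_risk_score(
    dosages: Sequence[float], effect_sizes: Sequence[float]
) -> float:
    """Sum of risk-allele dosages (0, 1, or 2 copies) weighted by effect size."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("need one effect size per variant")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def threshold_effects(
    effect_sizes: Sequence[float], p_values: Sequence[float], p_cutoff: float
) -> List[float]:
    """Thresholding step only: zero out variants whose p-value exceeds the cutoff.

    A real P+T pipeline would first prune or clump variants in linkage
    disequilibrium; that step is deliberately left out of this sketch.
    """
    return [b if p <= p_cutoff else 0.0 for b, p in zip(effect_sizes, p_values)]


# Hypothetical GWAS summary statistics for three variants (not real data).
betas = [0.5, -0.25, 0.125]
p_values = [1e-8, 0.5, 1e-3]

# Hypothetical risk-allele counts for one individual.
dosages = [2, 1, 0]

full_score = polygenic_risk_score(dosages, betas)
thresholded_betas = threshold_effects(betas, p_values, p_cutoff=0.01)
pt_score = polygenic_risk_score(dosages, thresholded_betas)
print(full_score, pt_score)
```

In this toy case the second variant fails the cutoff, so its negative effect drops out of the thresholded score. Choosing the cutoff is a tuning decision, which is one reason prediction accuracy can differ between construction methods.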
Transposable elements (TEs) make up a significant portion of eukaryotic genomes and are important drivers of genome evolution. However, the extent to which TEs affect gene expression variation on a genome-wide scale in comparison with other types of variants is still unclear. We characterized TE insertion polymorphisms and their association with gene expression in 124 whole-genome sequences from a single population of Capsella grandiflora, and contrasted this with the effects of single nucleotide polymorphisms (SNPs). Population frequency of insertions was negatively correlated with distance to genes, as well as density of conserved noncoding elements, suggesting that the negative effects of TEs on gene regulation are important in limiting their abundance. Rare TE variants strongly influence gene expression variation, predominantly through downregulation. In contrast, rare SNPs contribute equally to up- and down-regulation, but have a weaker individual effect than TEs. An expression quantitative trait loci (eQTL) analysis shows that a greater proportion of common TEs are eQTLs as opposed to common SNPs, and a third of the genes with TE eQTLs do not have SNP eQTLs. In contrast with rare TE insertions, common insertions are more likely to increase expression, consistent with recent models of cis-regulatory evolution favoring enhancer alleles. Taken together, these results imply that TEs are a significant contributor to gene expression variation and are individually more likely than rare SNPs to cause extreme changes in gene expression.