Genotype-to-phenotype mapping commonly focuses on two major classes of mutations: single nucleotide polymorphisms (SNPs) and copy number variation (CNV). Here, we discuss an underestimated third class of genotypic variation: changes in microsatellite and minisatellite repeats. Such tandem repeats (TRs) are ubiquitous, unstable genomic elements that have historically been designated as nonfunctional "junk DNA" and are therefore mostly ignored in comparative genomics. However, as many as 10% to 20% of eukaryotic genes and promoters contain an unstable repeat tract. Mutations in these repeats often have fascinating phenotypic consequences. For example, changes in unstable repeats located in or near human genes can lead to neurodegenerative diseases such as Huntington disease. Apart from their role in disease, variable repeats also confer useful phenotypic variability, including cell surface variability, plasticity in skeletal morphology, and tuning of the circadian rhythm. As such, TRs combine characteristics of genetic and epigenetic changes that may facilitate organismal evolvability.
Copy number variations (CNVs) and single nucleotide polymorphisms (SNPs) have been the major focus of most large-scale comparative genomics studies to date. Here, we discuss a third, largely ignored type of genetic variation, namely changes in tandem repeat number. Historically, tandem repeats have been designated as nonfunctional “junk” DNA, mostly as a result of their highly unstable nature. With the exception of tandem repeats involved in human neurodegenerative diseases, repeat variation was often believed to be neutral, with no phenotypic consequences. Recent studies, however, have shown that as many as 10% to 20% of coding and regulatory sequences in eukaryotes contain an unstable repeat tract. Contrary to initial suggestions, tandem repeat variation can have useful phenotypic consequences. Examples include rapid variation in microbial cell surfaces, tuning of internal molecular clocks in flies, and dynamic morphological plasticity in mammals. As such, tandem repeats can be useful functional elements that facilitate evolvability and rapid adaptation.
Summary: Excessive expansions of glutamine (Q)-rich repeats in various human proteins are known to result in severe neurodegenerative disorders such as Huntington’s disease and several ataxias. However, the physiological role of these repeats and the consequences of more moderate repeat variation remain unknown. Here, we demonstrate that Q-rich domains are highly enriched in eukaryotic transcription factors, where they act as functional modulators. Incremental changes in the number of repeats in the yeast transcriptional regulator Ssn6 (Cyc8) result in systematic, repeat-length-dependent variation in the expression of target genes that results in direct phenotypic changes. The function of Ssn6 increases with its repeat number up to a certain threshold, beyond which further expansion leads to aggregation. Quantitative proteomic analysis reveals that the Ssn6 repeats affect its solubility and interactions with Tup1 and other regulators. Thus, Q-rich repeats are dynamic functional domains that modulate a regulator’s innate function, with the inherent risk of pathogenic repeat expansions.
Tandem repeats are short DNA sequences that are repeated head-to-tail with a propensity to be variable. They constitute a significant proportion of the human genome and also occur within coding and regulatory regions. Variation in these repeats can alter the function and/or expression of genes, allowing organisms to swiftly adapt to novel environments. Importantly, some repeat expansions have also been linked to certain neurodegenerative diseases. Therefore, accurate sequencing of tandem repeats could contribute to our understanding of common phenotypic variability and might uncover missing genetic factors in idiopathic clinical conditions. However, despite long-standing evidence for the functional role of repeats, they are largely ignored because of technical limitations in sequencing, mapping and typing. Here, we report on a novel capture technique and data filtering protocol that allowed simultaneous sequencing of thousands of tandem repeats in the genomes of a three-generation human family using GS-FLX-plus Titanium technology. Our results demonstrated that up to 7.6% of tandem repeats in this family (4% in coding sequences) differ from the reference sequence, and identified a de novo variation in the family tree. The method opens new routes to look at this underappreciated type of genetic variability, including the identification of novel disease-related repeats.
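To make "repeated head-to-tail" and "differ from the reference sequence" concrete, here is a minimal Python sketch. It is not the capture technique or filtering protocol reported above. The function names, the 1 to 6 bp unit range, the three-copy minimum, and the toy sequences are all illustrative assumptions, and only perfect (uninterrupted) repeats are detected.

```python
import re


def find_tandem_repeats(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Hypothetical helper: the unit is matched lazily so the shortest
    repeating unit is reported, and runs are non-overlapping.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    repeats = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        repeats.append((match.start(), unit, copies))
    return repeats


def longest_run(seq, unit):
    """Return the largest number of consecutive copies of unit in seq."""
    runs = re.finditer(r"(?:%s)+" % re.escape(unit), seq)
    return max((len(m.group(0)) // len(unit) for m in runs), default=0)


# Toy alleles (invented for illustration): same flanks, different CAG counts.
reference = "AT" + "CAG" * 5 + "TTC"
sample = "AT" + "CAG" * 7 + "TTC"

ref_repeats = find_tandem_repeats(reference)
# Copy-number difference of each reference repeat unit in the sample allele.
differences = {
    unit: longest_run(sample, unit) - copies
    for _, unit, copies in ref_repeats
}
print(ref_repeats, differences)
```

In real sequencing data, repeats are often imperfect and reads may not span the whole tract, so a dedicated repeat finder and genotyper would be needed; the sketch only shows what a repeat-number difference between a sample and a reference looks like.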
Proteins with amino acid homorepeats have the potential to be detrimental to cells and are often associated with human diseases. Why, then, are homorepeats prevalent in eukaryotic proteomes? In yeast, homorepeats are enriched in proteins that are essential and pleiotropic and that buffer environmental insults. The presence of homorepeats increases the functional versatility of proteins by mediating protein interactions and facilitating spatial organization in a repeat-dependent manner. During evolution, homorepeats are preferentially retained in proteins with stringent proteostasis, which might minimize repeat-associated detrimental effects such as unregulated phase separation and protein aggregation. Their presence facilitates rapid protein divergence through accumulation of amino acid substitutions, which often affect linear motifs and post-translational-modification sites. These substitutions may result in rewiring protein interaction and signaling networks. Thus, homorepeats are distinct modules that are often retained in stringently regulated proteins. Their presence facilitates rapid exploration of the genotype-phenotype landscape of a population, thereby contributing to adaptation and fitness.