Homopolymeric amino acid repeats (AARs) like polyalanine (polyA) and polyglutamine (polyQ) in some developmental proteins (DPs) regulate certain aspects of organismal morphology and behavior, suggesting an evolutionary role for AARs as developmental “tuning knobs.” It is still unclear, however, whether these are occasional protein-specific phenomena or hints at the existence of a whole AAR-based regulatory system in DPs. Using novel approaches to trace their functional and evolutionary history, we find quantitative evidence supporting a generalized, combinatorial role of AARs in developmental processes with evolutionary implications. We observe nonrandom AAR distributions and combinations in HOX and other DPs, as well as in their interactomes, defining elements of a proteome-wide combinatorial functional code whereby different AARs and their combinations appear preferentially in proteins involved in the development of specific organs/systems. Such functional associations can be either static or display detectable evolutionary dynamics. These findings suggest that progressive changes in AAR occurrence/combination, by altering embryonic development, may have contributed to taxonomic divergence, leaving detectable traces in the evolutionary history of proteomes. Consistent with this hypothesis, we find that the evolutionary trajectories of the 20 AARs in eukaryotic proteomes are highly interrelated and their individual or compound dynamics can sharply mark taxonomic boundaries, or display clock-like trends, carrying overall a strong phylogenetic signal. These findings provide quantitative evidence and an interpretive framework outlining a combinatorial system of AARs whose compound dynamics mark at the same time DP functions and evolutionary transitions.
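As a rough illustration of what detecting AARs and their combinations in a protein sequence could look like, the following minimal Python sketch scans a sequence for runs of a single residue. The function names, the minimum run length of 4, and the toy sequence are assumptions made for illustration only; the abstract does not state the detection criteria or tools actually used.

```python
import re

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def find_aars(sequence, min_len=4):
    """Return (residue, start, length) for each homopolymeric run.

    min_len is a hypothetical threshold, not the one used in the study.
    """
    # One residue followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([%s])\1{%d,}" % (AMINO_ACIDS, min_len - 1))
    return [
        (m.group(1), m.start(), m.end() - m.start())
        for m in pattern.finditer(sequence.upper())
    ]


def aar_combination(sequence, min_len=4):
    """Return the set of residue types that form at least one AAR."""
    return frozenset(res for res, _, _ in find_aars(sequence, min_len))


# Toy sequence invented for this example (not a real protein).
toy = "MAAAAAGSQQQQQQLP"
print(find_aars(toy))        # runs of A and Q with positions and lengths
print(aar_combination(toy))  # which repeat types co-occur in the sequence
```

Collecting such per-protein combinations across a proteome would be one simple way to tabulate which AAR types co-occur, which is the kind of count a combinatorial analysis could start from.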
Intermolecular co-evolution optimizes physiological performance in functionally related proteins, ultimately increasing molecular co-adaptation and evolutionary fitness. Polyglutamine (polyQ) repeats, which are over-represented in nervous system-related proteins, are increasingly recognized as length-dependent regulators of protein function and interactions, and their length variation contributes to intraspecific phenotypic variability and interspecific divergence. However, it is unclear whether polyQ repeat lengths evolve independently in each protein or rather co-evolve across functionally related protein pairs and networks, as in an integrated regulatory system. To address this issue, we investigated here the length evolution and co-evolution of polyQ repeats in clusters of functionally related and physically interacting neural proteins in Primates. We observed function-/disease-related polyQ repeat enrichment and evolutionary hypervariability in specific neural protein clusters, particularly in the neurocognitive and neuropsychiatric domains. Notably, these analyses detected extensive patterns of intermolecular polyQ length co-evolution in pairs and clusters of functionally related, physically interacting proteins. Moreover, they revealed both direct and inverse polyQ length co-variation in protein pairs, together with complex patterns of coordinated repeat variation in entire polyQ protein sets. These findings uncover a whole system of co-evolving polyQ repeats in neural proteins with direct implications for understanding polyQ-dependent phenotypic variability, neurocognitive evolution and neuropsychiatric disease pathogenesis.
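The notion of direct versus inverse polyQ length co-variation can be pictured with a simple correlation across species. The sketch below is only an assumed simplification: it takes the longest glutamine run in each ortholog and computes a plain Pearson coefficient between two proteins. The abstract does not specify the statistical method used, the helper names are hypothetical, and the repeat lengths are invented toy values.

```python
import re
from math import sqrt


def longest_polyq(sequence):
    """Length of the longest uninterrupted glutamine (Q) run, 0 if none."""
    runs = re.findall(r"Q+", sequence.upper())
    return max((len(r) for r in runs), default=0)


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None  # no length variation in one protein
    return cov / sqrt(vx * vy)


# Invented polyQ lengths for two hypothetical proteins in five species.
protein_a = [10, 12, 15, 18, 21]
protein_b = [30, 27, 25, 20, 18]

r = pearson(protein_a, protein_b)
# r > 0 would suggest direct co-variation, r < 0 inverse co-variation.
print(r)
```

A real analysis would also need to account for the shared phylogeny of the species, since closely related species are not independent observations; this sketch ignores that.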
The fusion of the SARS-CoV-2 virus with cells, a key event in the pathogenesis of COVID-19, depends on the assembly of a six-helix fusion core (FC) formed by portions of the spike protein heptad repeats (HR) 1 and 2. Despite the FC's critical role in regulating infectivity, its distinctive features, origin, and evolution are poorly understood. Thus, we undertook a structure-guided positional and compositional analysis of the SARS-CoV-2 FC, in comparison with FCs of related viruses, tracing its origin and ongoing evolution. We find that clustered amino acid substitutions within HR1, distinguishing SARS-CoV-2 from SARS-CoV-1, enhance local heptad stereotypy and sharply increase the FC serine-to-glutamine (S/Q) ratio, determining a neat alternate layering of S-rich and Q-rich subdomains along the post-fusion structure. Strikingly, SARS-CoV-2 ranks among viruses with the highest FC S/Q ratio, together with highly syncytiogenic respiratory pathogens (RSV, NDV), whereas MERS-CoV, HIV, and Ebola viruses display low ratios, and this feature is reflected in S/Q segregation and H-bonding patterns. Our evolutionary analyses reveal that the SARS-CoV-2 FC occurs in other SARS-CoV-1-like Sarbecoviruses identified since 2005 in Hong Kong and adjacent regions, tracing its origin to >50 years ago with a recombination-driven spread. Finally, current mutational trends show that the FC is varying especially in the FC1 evolutionary hotspot. These findings establish a novel analytical framework illuminating the sequence/structure evolution of the SARS-CoV-2 FC, tracing its long history within Sarbecoviruses, and may help rationalize the evolution of the fusion machinery in emerging pathogens and the design of novel therapeutic fusion inhibitors.
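The serine-to-glutamine (S/Q) ratio mentioned above is, at its simplest, a compositional count over a sequence segment. The following minimal Python sketch shows one way such a ratio could be computed. The function name and the toy fragment are hypothetical, and the abstract does not state which FC boundaries or counting conventions the authors applied.

```python
def sq_ratio(sequence):
    """Ratio of serine (S) to glutamine (Q) counts in a sequence.

    Returns None when the sequence has no glutamine, since the ratio
    is undefined in that case.
    """
    seq = sequence.upper()
    serines = seq.count("S")
    glutamines = seq.count("Q")
    if glutamines == 0:
        return None
    return serines / glutamines


# Toy fragment invented for this example (not a real fusion core sequence).
toy_fragment = "SSQSAQ"
print(sq_ratio(toy_fragment))  # 3 serines / 2 glutamines = 1.5
```

Applying the same function to the aligned FC segments of different viruses would give comparable ratios, provided the segment boundaries are defined consistently.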