Recent evidence demonstrates that novel protein-coding genes can arise de novo from non-genic loci. This evolutionary innovation is thought to be facilitated by the pervasive translation of non-genic transcripts, which exposes a reservoir of variable polypeptides to natural selection. Here, we systematically characterize how these de novo emerging coding sequences impact fitness in budding yeast. Disruption of emerging sequences is generally inconsequential for fitness in the laboratory and in natural populations. Overexpression of emerging sequences, however, is enriched in adaptive fitness effects compared to overexpression of established genes. We find that adaptive emerging sequences tend to encode putative transmembrane domains, and that thymine-rich intergenic regions harbor a widespread potential to produce transmembrane domains. These findings, together with in-depth examination of the de novo emerging YBR196C-A locus, suggest a novel evolutionary model whereby adaptive transmembrane polypeptides emerge de novo from thymine-rich non-genic regions and subsequently accumulate changes molded by natural selection.
Ribosome profiling experiments demonstrate widespread translation of eukaryotic genomes outside of annotated protein-coding genes. However, it is unclear how much of this "noncanonical" translation contributes biologically relevant microproteins rather than insignificant translational noise. Here, we developed an integrative computational framework (iRibo) that leverages hundreds of ribosome profiling experiments to detect signatures of translation with high sensitivity and specificity. We deployed iRibo to construct a reference translatome in the model organism S. cerevisiae. We identified ~19,000 noncanonical translated elements outside of the ~5,400 canonical yeast protein-coding genes. Most (65%) of these noncanonical translated elements were located on transcripts annotated as non-coding, or entirely unannotated, while the remainder were located on the 5' and 3' ends of mRNA transcripts. Only 14 noncanonical translated elements were evolutionarily conserved. In stark contrast with canonical protein-coding genes, the great majority of the yeast noncanonical translatome appeared evolutionarily transient and showed no signatures of selection. Yet, we uncovered phenotypes for 53% of a representative subset of evolutionarily transient translated elements. The iRibo framework and reference translatome described here provide a foundation for further investigation of a largely unexplored, but biologically significant, evolutionarily transient translatome.
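The link between thymine-rich sequence and transmembrane potential described in the abstract above can be illustrated with the standard genetic code, in which every codon with thymine at the second position encodes a hydrophobic residue. The following is a minimal sketch and not the authors' analysis pipeline: the function names are hypothetical, the input is assumed to contain only A/C/G/T, and the Kyte-Doolittle scale is used here as one common hydropathy measure.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy scale (higher value = more hydrophobic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def thymine_fraction(dna: str) -> float:
    """Fraction of bases that are thymine (hypothetical helper)."""
    dna = dna.upper()
    return dna.count("T") / len(dna) if dna else 0.0


def translate(dna: str) -> str:
    """Translate in frame 0, stopping at the first stop codon.

    Assumes the sequence contains only A, C, G, and T.
    """
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle score of a peptide (0.0 for an empty peptide)."""
    if not peptide:
        return 0.0
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def second_position_t_residues() -> set:
    """Amino acids encoded by codons whose middle base is thymine."""
    return {aa for codon, aa in CODON_TABLE.items() if codon[1] == "T"}


if __name__ == "__main__":
    seq = "TTTCTTATTGTT"  # illustrative thymine-rich sequence, not from the study
    print(thymine_fraction(seq), translate(seq), mean_hydropathy(translate(seq)))
    print(sorted(second_position_t_residues()))
```

In this sketch, the codons with a middle thymine map only to Phe, Leu, Ile, Met, and Val, which is one simple way to see why thymine-rich stretches tend to translate into hydrophobic, potentially membrane-spanning peptides.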
Dihydrofolate reductase (DHFR) is a ubiquitous enzyme with an essential role in cell metabolism. DHFR catalyzes the reduction of dihydrofolate to tetrahydrofolate, which is a precursor for purine and thymidylate synthesis. Several DHFR-targeting antifolate drugs, including trimethoprim, a competitive antibacterial inhibitor, have therefore been developed and are clinically used. Evolution of resistance against antifolates is a common public health problem rendering these drugs ineffective. To combat the resistance problem, it is important to understand resistance-conferring changes in the DHFR structure and accordingly develop alternative strategies. Here, we structurally and dynamically characterize Escherichia coli DHFR in its wild-type (WT) and trimethoprim-resistant L28R mutant forms in the presence of the substrate and its inhibitor trimethoprim. We use molecular dynamics simulations to determine the conformational space, loop dynamics and hydrogen bond distributions at the active site of DHFR for the WT and the L28R mutant. We also report their experimental kcat, Km, and Ki values, accompanied by isothermal titration calorimetry measurements of DHFR that distinguish enthalpic and entropic contributions to trimethoprim binding. Although mutations that confer resistance to competitive inhibitors typically make enzymes more promiscuous and decrease affinity to both the substrate and the inhibitor, strikingly, we find that the L28R mutant has a unique resistance mechanism. While the binding affinity differences between the WT and the mutant for the inhibitor and the substrate are small, the newly formed extra hydrogen bonds with the aminobenzoyl glutamate tail of DHF in the L28R mutant lead to increased barriers for the dissociation of the substrate and the product. Therefore, the L28R mutant indirectly gains resistance by enjoying prolonged binding times in the enzyme-substrate complex. While this also leads to slower product release and decreases the catalytic rate of the L28R mutant, the overall effect is the maintenance of a sufficient product formation rate. Finally, the experimental and computational analyses together reveal the changes that occur in the energetic landscape of DHFR upon the resistance-conferring L28R mutation. We show that the negative entropy associated with the binding of trimethoprim in WT DHFR is due to water organization at the binding interface. Our study lays the framework to study structural changes in other trimethoprim-resistant DHFR mutants.
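For readers unfamiliar with the kinetic parameters named in the abstract above (kcat, Km, Ki), the textbook rate law for competitive inhibition may help. The sketch below is not the authors' model and does not use their measured values: the function names and all numbers are hypothetical, and the residence-time helper assumes simple first-order dissociation.

```python
def competitive_rate(s: float, i: float, vmax: float, km: float, ki: float) -> float:
    """Michaelis-Menten rate with a competitive inhibitor.

    v = Vmax * [S] / (Km * (1 + [I]/Ki) + [S])
    All concentrations and constants must share the same units.
    """
    return vmax * s / (km * (1.0 + i / ki) + s)


def apparent_km(km: float, i: float, ki: float) -> float:
    """Apparent Km in the presence of a competitive inhibitor."""
    return km * (1.0 + i / ki)


def mean_residence_time(k_off: float) -> float:
    """Mean lifetime of a complex that dissociates with first-order rate k_off."""
    return 1.0 / k_off


if __name__ == "__main__":
    # Hypothetical values chosen only to show the shape of the relationship.
    vmax, km, ki = 1.0, 10.0, 5.0
    for inhibitor in (0.0, 5.0, 50.0):
        rate = competitive_rate(s=10.0, i=inhibitor, vmax=vmax, km=km, ki=ki)
        print(inhibitor, apparent_km(km, inhibitor, ki), rate)
```

In this simple picture an inhibitor only raises the apparent Km, whereas a lower dissociation rate lengthens the residence time of the bound substrate. The abstract reports that L28R gains resistance through the latter kind of effect, which this rate law alone does not capture.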
Recent evidence demonstrates that novel protein-coding genes can arise de novo from intergenic loci. This evolutionary innovation is thought to be facilitated by the pervasive translation of intergenic transcripts, which exposes a reservoir of variable polypeptides to natural selection. Do intergenic translation events yield polypeptides with useful biochemical capacities? The answer to this question remains controversial. Here, we systematically characterized how de novo emerging coding sequences impact fitness. In budding yeast, overexpression of these sequences was enriched in beneficial effects, while their disruption was generally inconsequential. We found that beneficial emerging sequences have a strong tendency to encode putative transmembrane proteins, which appears to stem from a cryptic propensity for transmembrane signals throughout thymine-rich intergenic regions of the genome. These findings suggest that novel genes with useful biochemical capacities, such as transmembrane domains, tend to evolve de novo within intergenic loci that already harbored a blueprint for these capacities.
The molecular mechanisms and dynamics of de novo gene birth are poorly understood [1]. It is particularly unclear how non-genic sequences could spontaneously encode proteins with specific and useful biochemical capacities. To resolve this paradox, it has been proposed that pervasive translation of non-genic transcripts can expose genetic variation, in the form of novel polypeptides, to natural selection, thereby purging toxic sequences and providing adaptive potential to the organism [2,3]. The genomic sequences encoding these novel polypeptides have been called "protogenes", to denote that they correspond to a distinct class of genetic elements that are intermediates between non-genic sequences and established genes [3]. In agreement, several studies reported that de novo emerging coding sequences tend to display lengths, transcript architectures, transcription levels, strength of purifying selection, sequence compositions, structural features and integration in cellular networks that are intermediate between those observed in non-genic sequences and those observed in established genes [3-8]. Furthermore, pervasive translation of non-genic sequences has been observed repeatedly by ribosome profiling and proteo-genomics [3,9-12], and studies have shown that random sequence libraries harbor bioactive effects [13-17]. Nonetheless, whether and how native proto-genes carry adaptive potential remains unknown. We sought to formalize the predictions of adaptive proto-gene evolution. We define adaptive potential as the capacity to increase fitness by means of evolutionary change. While any sequence may in theory carry adaptive potential, changes in established genes are typically constrained by preexisting selected effects, namely the specific physiological processes mediated by the gene products that lead to their evolutionary conservation [18]. In contrast, emerging proto-genes are expected to mostly lack ...