Structural Genomics aims to elucidate protein structures to identify their functions. Unfortunately, the variation of just a few residues can be enough to alter activity or binding specificity and limit the functional resolution of annotations based on sequence and structure; in enzymes, substrates are especially difficult to predict. Here, large-scale controls and direct experiments show that the local similarity of five or six residues selected because they are evolutionarily important and on the protein surface can suffice to identify an enzyme activity and substrate. A motif of five residues predicted that a previously uncharacterized Silicibacter sp. protein was a carboxylesterase for short fatty acyl chains, similar to hormone-sensitive-lipase-like proteins that share less than 20% sequence identity. Assays and directed mutations confirmed this activity and showed that the motif was essential for catalysis and substrate specificity. We conclude that evolutionary and structural information may be combined on a Structural Genomics scale to create motifs of mixed catalytic and noncatalytic residues that identify enzyme activity and substrate specificity.
As the list of known genes grows exponentially, the elucidation of their function remains a major bottleneck and lags far behind the production of sequences (1-5). The best approach remains to search computationally for functionally characterized sequence homologs, ideally with greater than 50% sequence identity (6). Binding specificity, however, is sensitive to subtle amino acid differences, and the transfer of substrate annotations between related enzymes is prone to errors when sequence identity is below 65-80% (7-9). These thresholds vary from case to case: some orthologs will maintain identical functions down to 25% sequence identity (9), whereas paralogs can take on highly diverse activities (10).
Other difficulties that plague annotation transfer between homologs are that individual small molecules may each bind to multiple and distinct molecular pockets (11), that different residues can support similar chemistries (12), and that activity can vary even when catalytic residues are conserved (13-18). To raise annotation accuracy, Structural Genomics (19) made structural information widely available and spurred the development of annotation methods dependent on local chemical and physical environments (20), sequence and structural comparisons (21), or 3D templates (22). The latter methods search across proteins for local structural similarities over a few signature residues that represent the telltale parts of a functional site, so-called "3D templates" (3,14,18,22-24). The residue composition of 3D templates is critical, however, and is derived from experiments (25) or from analyses of functional sites and determinants (14,15,26). The sensitivity and specificity of template-based annotations still need to be established experimentally (27,28), but retrospective controls suggest they often predict enzyme catalytic activity (14,16,17,29,30...
High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks.
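The competitive diffusion step described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ETA implementation: the damping factor `alpha`, the fixed iteration count, the z-score obtained by standardizing each function's diffused scores across all nodes, and the six-node toy network with its EC labels are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev


def diffuse_labels(edges, known, alpha=0.5, n_iter=50):
    """Spread functional labels link-by-link over a weighted network.

    edges: dict node -> {neighbor: weight}, assumed symmetric.
    known: dict node -> label for proteins with known function.
    alpha and n_iter are illustrative assumptions.
    """
    nodes = sorted(edges)
    labels = sorted(set(known.values()))
    # Seed scores: 1.0 where a node is known to carry a label, else 0.0.
    init = {n: {lab: 1.0 if known.get(n) == lab else 0.0 for lab in labels}
            for n in nodes}
    scores = {n: dict(init[n]) for n in nodes}
    for _ in range(n_iter):
        new = {}
        for n in nodes:
            total_w = sum(edges[n].values())
            new[n] = {}
            for lab in labels:
                # Weighted average of the neighbors' current scores.
                nb = 0.0
                if total_w > 0:
                    nb = sum(w * scores[m][lab]
                             for m, w in edges[n].items()) / total_w
                # Keep part of the seed so known labels stay anchored.
                new[n][lab] = (1 - alpha) * init[n][lab] + alpha * nb
        scores = new
    return scores, labels


def annotate(scores, labels):
    """Pick, for each node, the label with the highest z-score.

    Assumption: the z-score standardizes each label's diffused scores
    across all nodes; the study may define significance differently.
    """
    z = {n: {} for n in scores}
    for lab in labels:
        vals = [scores[n][lab] for n in scores]
        mu, sd = mean(vals), pstdev(vals)
        for n in scores:
            z[n][lab] = (scores[n][lab] - mu) / sd if sd > 0 else 0.0
    winners = {n: max(labels, key=lambda lab: z[n][lab]) for n in scores}
    return winners, z


# Hypothetical toy network: two tight clusters joined by one weak link.
edges = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 0.1},
    "D": {"C": 0.1, "E": 1.0},
    "E": {"D": 1.0, "F": 1.0},
    "F": {"E": 1.0},
}
known = {"A": "EC 3.1.1.1", "F": "EC 2.7.1.1"}

scores, labels = diffuse_labels(edges, known)
annotation, zscores = annotate(scores, labels)
for node in sorted(annotation):
    print(node, annotation[node])
```

In this toy case each unlabeled node inherits the label of the annotated protein in its own cluster, because the weak link between clusters passes little score. A real application would also need a z-score cutoff below which a node is left unannotated, which is how a coverage versus accuracy trade-off arises.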
The Six3 and Rx3 homeodomain proteins are essential for the specification and proliferation of forebrain and retinal precursor cells of the vertebrate brain, and the regulatory networks that control their expression are beginning to be elucidated. We identify the zebrafish lmo4b gene as a negative regulator of forebrain growth that acts via restriction of six3 and rx3 expression during early segmentation stages. Loss of lmo4b by morpholino knockdown results in enlargement of the presumptive telencephalon and optic vesicles and an expansion of the post-gastrula expression domains of six3 and rx3. Overexpression of lmo4b by mRNA injection causes complementary phenotypes, including a reduction in the amount of anterior neural tissue, especially in the telencephalic, optic and hypothalamic primordia, and a dosage-sensitive reduction in six3 and rx3 expression. We suggest that lmo4b activity is required at the neural boundary to restrict six3b expression, and later within the neural plate for attenuation of rx3 expression independently of its effect on six3 transcription. We propose that lmo4b has an essential role in forebrain development as a modulator of six3 and rx3 expression, and thus indirectly influences neural cell fate commitment, cell proliferation and tissue growth in the anterior CNS.
Natural selection for specific functions places limits upon the amino acid substitutions a protein can accept. Mechanisms that expand the range of tolerable amino acid substitutions include chaperones that can rescue destabilized proteins and additional, stability-enhancing substitutions. Here, we present an alternative mechanism that is simple and uses a frequently encountered network motif. Computational and experimental evidence shows that the self-correcting, negative feedback gene regulation motif increases repressor expression in response to deleterious mutations and thereby precisely restores repression of a target gene. Furthermore, this ability to rescue repressor function is observable across the Eubacteria Kingdom through the greater accumulation of amino acid substitutions in negative feedback transcription factors compared to genes they control. We propose that negative feedback represents a self-contained genetic canalization mechanism that preserves phenotype while permitting access to a wider range of functional genotypes.
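The buffering effect of negative autoregulation can be illustrated with a toy steady-state calculation. This is a sketch under simplifying assumptions and not the model used in the study: a repressor with non-cooperative binding, unit degradation rate, a hypothetical maximal synthesis rate `beta`, a binding constant `K` shared by both promoters, and a mutation modeled as a drop in the active fraction `f` of the repressor. With these choices the model shows the direction of the effect (more repressor is made, and the target is derepressed less than without feedback) rather than the precise restoration reported in the abstract.

```python
import math


def feedback_total_repressor(beta, K, f):
    """Steady-state total repressor when it represses its own promoter.

    Solves R * (1 + f * R / K) = beta for R >= 0, i.e. synthesis
    beta / (1 + f * R / K) balanced by first-order loss at unit rate.
    beta, K and f (active fraction, f > 0) are illustrative parameters.
    """
    a = f / K
    return (-1.0 + math.sqrt(1.0 + 4.0 * a * beta)) / (2.0 * a)


def target_expression(active, K):
    """Relative expression of a target gene repressed by the active repressor."""
    return 1.0 / (1.0 + active / K)


beta, K = 100.0, 1.0          # hypothetical parameter values
f_wt, f_mut = 1.0, 0.25       # mutation leaves 25% of the repressor active

# With negative feedback: the total repressor level responds to the mutation.
r_wt = feedback_total_repressor(beta, K, f_wt)
r_mut = feedback_total_repressor(beta, K, f_mut)
fb_change = (target_expression(f_mut * r_mut, K)
             / target_expression(f_wt * r_wt, K))

# Without feedback: total repressor is fixed at the wild-type level.
open_change = (target_expression(f_mut * r_wt, K)
               / target_expression(f_wt * r_wt, K))

print("repressor fold increase with feedback:", r_mut / r_wt)
print("target derepression with feedback:    ", fb_change)
print("target derepression without feedback: ", open_change)
```

Making the autoregulation cooperative (a Hill coefficient above 1) would tighten the compensation in this kind of model, which is one reason the simple non-cooperative form here should be read only as a qualitative illustration.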