SummaryThe plant SLAC1 anion channel controls turgor pressure in the aperture-defining guard cells of plant stomata, thereby regulating exchange of water vapor and photosynthetic gases in response to environmental signals such as drought or high levels of carbon dioxide. We determined the crystal structure of a bacterial homolog of SLAC1 at 1.20Å resolution, and we have used structure-inspired mutagenesis to analyze the conductance properties of SLAC1 channels. SLAC1 is a symmetric trimer composed from quasi-symmetric subunits, each having ten transmembrane helices arranged from helical hairpin pairs to form a central five-helix transmembrane pore that is gated by an extremely conserved phenylalanine residue. Conformational features suggest a mechanism for control of gating by kinase activation, and electrostatic features of the pore coupled with electrophysiological characteristics suggest that selectivity among different anions is largely a function of the energetic cost of ion dehydration.
The rate at which genome sequencing data is accruing demands enhanced methods for functional annotation and metabolism discovery. Solute binding proteins (SBPs) facilitate the transport of the first reactant in a metabolic pathway, thereby constraining the regions of chemical space and the chemistries that must be considered for pathway reconstruction. We describe high-throughput protein production and differential scanning fluorimetry platforms, which enabled the screening of 158 SBPs against a 189 component library specifically tailored for this class of proteins. Like all screening efforts, this approach is limited by the practical constraints imposed by construction of the library, i.e., we can study only those metabolites that are known to exist and which can be made in sufficient quantities for experimentation. To move beyond these inherent limitations, we illustrate the promise of crystallographic- and mass spectrometric-based approaches for the unbiased use of entire metabolomes as screening libraries. Together, our approaches identified 40 new SBP ligands, generated experiment-based annotations for 2084 SBPs in 71 isofunctional clusters, and defined numerous metabolic pathways, including novel catabolic pathways for the utilization of ethanolamine as sole nitrogen source and the use of d-Ala-d-Ala as sole carbon source. These efforts begin to define an integrated strategy for realizing the full value of amassing genome sequence data.
Large-scale activity profiling of enzyme superfamilies provides information about cellular functions as well as the intrinsic binding capabilities of conserved folds. Herein, the functional space of the ubiquitous haloalkanoate dehalogenase superfamily (HADSF) was revealed by screening a customized substrate library against >200 enzymes from representative prokaryotic species, enabling inferred annotation of ∼35% of the HADSF. An extremely high level of substrate ambiguity was revealed, with the majority of HADSF enzymes using more than five substrates. Substrate profiling allowed assignment of function to previously unannotated enzymes with known structure, uncovered potential new pathways, and identified isofunctional orthologs from evolutionarily distant taxonomic groups. Intriguingly, the HADSF subfamily having the least structural elaboration of the Rossmann fold catalytic domain was the most specific, consistent with the concept that domain insertions drive the evolution of new functions and that the broad specificity observed in HADSF may be a relic of this process.evolution | specificity | phosphatase | substrate screen | promiscuity S ince the first genomes were sequenced, there has been an exponential increase in the number of protein sequences deposited into databases worldwide. At the time of this writing the UniProtKB/TrEMBL database contains over 32 million protein sequences. Although this increase in sequence data has dramatically enhanced our understanding of the genomic organization of organisms, as the number of protein sequences grows, the proportion of firm functional assignments diminishes. Traditionally, methods of functional annotation involve comparing sequence identity between experimentally characterized proteins and newly sequenced ones, typically via BLAST (1). In cases where significant sequence similarity cannot be ascertained, proteins are annotated as "hypothetical" or "putative." Moreover, the decrease in sequence identity leads to an increased uncertainty in functional assignment, especially as the phylogenetic distance between organisms grows, limiting iso-functional ortholog discovery.As the number of newly sequenced genomes grows larger, more protein sequences are likely to be misannotated, oftentimes resulting in the propagation of incorrect functional annotation across newly identified sequences. To tackle the problem of unannotated or misannotated proteins, newer methods for computational assignment have been created with varying degrees of success (2). Although these methods outperform historical methods, continued improvement is necessary to ensure accurate annotation of function (2). A greater swath of functional space can be covered by screening substrates in a high-throughput manner on multiple enzymes from a family (3, 4). Family-wide substrate profiling offers a data-rich resource. The use of sparse screening of sequence space and a diversified library permits the determination of substrate specificity profiles to provide a familywide view of the range of substrates...
Assigning valid functions to proteins identified in genome projects is challenging, with over-prediction and database annotation errors major concerns1. We, and others2, are developing computation-guided strategies for functional discovery using “metabolite docking” to experimentally derived3 or homology-based4 three-dimensional structures. Bacterial metabolic pathways often are encoded by “genome neighborhoods” (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by “predicting” the intermediates in the glycolytic pathway in E. coli5. Metabolite docking to multiple binding proteins/enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. We report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed i) the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and ii) the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guide functional predictions to enable the discovery of new metabolic pathways.
Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ∼85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks3 and GNNs will facilitate the discovery of the components of novel, uncharacterized metabolic pathways in sequenced genomes.DOI: http://dx.doi.org/10.7554/eLife.03275.001
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.