Inferring protein function from structure is a challenging task, as a large number of orphan protein structures from structural genomics projects have now been solved without their biochemical functions being characterized. For proteins that bind similar substrates or ligands and carry out similar functions, the binding surfaces experience similar physicochemical constraints, and hence the sets of allowed and forbidden residue substitutions are similar. However, it is difficult to isolate selection pressure due to protein function from selection pressure due to protein folding, and evolutionary relationships reflected by global sequence and structure similarities between proteins are often unreliable for inferring protein function. We have developed a method, called pevoSOAR (Pocket-based EVOlutionary Search Of Amino acid Residues), for predicting protein functions by uncovering the residue substitution pattern due to protein function and separating it from the substitution pattern due to protein folding. We incorporate evolutionary information specific to an individual binding region and match local surfaces at large scale to identify those with similar functions. Our pevoSOAR method also computes a profile that characterizes protein binding activities that may involve multiple substrates or ligands. We show that our method can be used to predict enzyme functions accurately. It can also assess enzyme binding specificity and promiscuity. In an objective large-scale test of 100 enzyme families with thousands of structures, our predictions are found to be sensitive and specific: at the stringent specificity level of 99.98%, we can correctly predict enzyme functions for 80.55% of the proteins. The overall area under the Receiver Operating Characteristic curve measuring the performance of our prediction is 0.955, close to the perfect value of 1.00. The best Matthews correlation coefficient is 86.6%. Our method also works well in predicting the biochemical functions of orphan proteins from structural genomics projects.
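For readers unfamiliar with the reported measures, the following is a minimal sketch of how sensitivity, specificity, and the Matthews correlation coefficient are computed from a binary confusion matrix. The function name `confusion_metrics` and the counts in the usage lines are illustrative assumptions only; they are not data or code from the pevoSOAR study.

```python
from math import sqrt


def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) for binary predictions.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives.
    """
    sensitivity = tp / (tp + fn)  # fraction of true members recovered
    specificity = tn / (tn + fp)  # fraction of non-members rejected
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts chosen only to illustrate the formulas.
sens, spec, mcc = confusion_metrics(tp=80, fp=2, tn=9998, fn=20)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} MCC={mcc:.3f}")
```

Because the Matthews coefficient uses all four cells of the confusion matrix, it stays informative when negatives vastly outnumber positives, which is the usual situation when one enzyme family is searched against a large structure database.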
Accurate free energy estimation is essential for RNA structure prediction. The widely used Turner energy model works well for nested structures. For pseudoknotted RNAs, however, there is no effective rule for estimating loop entropy and free energy. In this work we present a new free energy estimation method, termed the pseudoknot predictor in three-dimensional space (pk3D), which goes beyond Turner's model. Our approach treats nested and pseudoknotted structures alike in one unifying physical framework, regardless of how complex the RNA structures are. We first test the ability of pk3D to select native structures from a large number of decoys for a set of 43 pseudoknotted RNA molecules, with lengths ranging from 23 to 113. We find that pk3D performs slightly better than the Dirks and Pierce extension of Turner's rule. We then test pk3D for blind secondary structure prediction, and find that pk3D gives the best sensitivity and comparable positive predictive value (related to specificity) in predicting pseudoknotted RNA secondary structures, when compared with other methods. A unique strength of pk3D is that it also generates the spatial arrangement of structural elements of the RNA molecule. Comparison of three-dimensional structures predicted by pk3D with the native structure measured by nuclear magnetic resonance or X-ray experiments shows that the predicted spatial arrangement of stems and loops is often similar to that found in the native structure. These close-to-native structures can be used as starting points for further refinement to derive accurate three-dimensional structures of RNA molecules, including those with pseudoknots.
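Sensitivity and positive predictive value for secondary structure prediction are commonly computed over sets of base pairs. The sketch below shows one such calculation under the assumption that a structure is given as a collection of (i, j) index pairs; the function name `base_pair_scores` and the example pairs are hypothetical and do not come from the pk3D work.

```python
def base_pair_scores(native_pairs, predicted_pairs):
    """Return (sensitivity, ppv) for predicted base pairs.

    Each structure is an iterable of (i, j) nucleotide index pairs.
    Pairs are normalized so that (i, j) and (j, i) count as the same pair.
    """
    native = {tuple(sorted(p)) for p in native_pairs}
    predicted = {tuple(sorted(p)) for p in predicted_pairs}
    true_positives = len(native & predicted)
    # Sensitivity: fraction of native pairs that were predicted.
    sensitivity = true_positives / len(native) if native else 0.0
    # PPV: fraction of predicted pairs that are native.
    ppv = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Hypothetical example: (5, 15) crosses the 1-10 stem, as in a pseudoknot.
native = [(1, 10), (2, 9), (3, 8), (5, 15)]
predicted = [(10, 1), (2, 9), (4, 7)]
sens, ppv = base_pair_scores(native, predicted)
print(f"sensitivity={sens:.2f} PPV={ppv:.2f}")
```

Scoring on sets of pairs makes no assumption that the structure is nested, so crossing pairs in a pseudoknot are counted the same way as any other pair.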
We have developed an efficient strategy to access a skeletally diverse chemical library, which entailed a sequence of enyne cycloisomerization, [4 + 2] cycloaddition, alkene dihydroxylation, and diol carbamylation. Using this approach, only 16 readily available building blocks were needed to produce a representative 191-member library, which displayed a broad distribution of molecular shapes and excellent physicochemical properties. This library further enabled identification of a small molecule that effectively suppressed glycolytic production of ATP and lactate in the CHO-K1 cell line, representing a potential lead for the development of a new class of glycolytic inhibitors.
diversity-oriented synthesis | glycolysis | skeletal diversity
Structurally diverse collections of small molecules provide a validated source of chemical probes for basic and translational biomedical research (1, 2). Variation of the scaffold architecture of such compound libraries is particularly desirable to enable identification of new bioactive chemical probes with higher probability and greater efficiency. High-throughput synthesis of skeletally diverse small-molecule libraries represents one of the most challenging aspects of diversity-oriented synthesis and requires identification of efficient reaction sequences that can rapidly convert a small subset of readily available compounds to a large number of skeletally diverse chemical entities for subsequent biomedical applications (3).
Transition metal-catalyzed cycloisomerization of enynes represents a powerful method for structural diversification (4-6). Our group previously demonstrated that various reaction topologies could be controlled by a proper choice of the transition metal catalyst, as well as the functionalization of the starting enyne (7-10).
Significant advances in this area can now enable incorporation of such transformations into synthetic strategies for the assembly of skeletally diverse chemical libraries (11-18). However, several challenges remain to be addressed to facilitate access to high-diversity chemical libraries. Typically, multiple cycloisomerization precursors are manually assembled to yield different skeletal frameworks upon their cycloisomerizations. A smaller number of building blocks would minimize this laborious process and increase efficiency. Another limitation is the difficulty of subsequent diversification of cycloisomerization products, which is complicated by the lack of common functional groups and variable chemical reactivity of such compounds. Ideally, the cycloisomerization should provide access to products containing the same functional group to enable the next diversity-generating step, which should yield another common functional group. If such common and reactive functional groups are efficiently produced at every stage of the synthesis, this synthetic pathway can readily provide access to a structurally diverse library starting with only a small set of building blocks.
We describe the development of a unique approach, which ha...
Predicting protein functions from structures is an important and challenging task. Although proteins are often thought to be packed as tightly as solids, closer examination based on geometric computation reveals that they contain numerous voids and pockets. Most of these arise at random, but some are binding sites that provide surfaces for interacting with other molecules. A promising approach is to infer function by discovering similarity in local binding pockets, as proteins that bind similar substrates or ligands and carry out similar functions have similar physical constraints for binding and reactions. In this chapter, we describe computational methods to distinguish those surface pockets that are likely to be involved in important biological functions, and methods to identify key residues in these pockets. We further describe how to predict protein functions at large scale (millions) from structures by detecting binding surfaces similar in residue make-up, shape, and orientation. We also describe a Bayesian Monte Carlo method that can separate selection pressure due to biological function from pressure due to protein folding. We show how this method can be used to reconstruct the evolutionary history of binding surfaces for detecting similar binding surfaces. In addition, we briefly discuss how the negative image of a binding pocket can be cast, and how such information can be used to facilitate drug discovery.