Impressive advances in the applications of bioinformatics for protein structure prediction, coupled with growing structural databases on the one hand and the insurmountable time-scale problem of ab initio computational methods on the other, continue to raise doubts about whether a computational solution to the protein folding problem, categorized as NP-hard, is within reach in the near future. Combining specially designed biophysical filters and vector algebra tools with ab initio methods, we present here a promising computational pathway for bracketing native-like structures of small alpha-helical globular proteins starting from secondary structural information. The automated protocol is initiated by generating multiple structures around the loops between secondary structural elements. A set of knowledge-based biophysical filters, namely persistence length and radius of gyration, developed and calibrated on approximately 1000 globular proteins, is introduced to screen the trial structures, filter out improbable candidates for the native, and reduce the size of the library of probable structures. The ensemble so generated encompasses a few structures with native-like topology. Monte Carlo optimizations of the loop dihedrals are then carried out to remove steric clashes. The resultant structures are energy minimized and ranked according to a scoring function tested previously on a series of decoy sets vis-a-vis their corresponding natives. We find that the 100 lowest-energy structures culled from the ensemble of energy-optimized trial structures include at least a few within 3-5 angstroms of the native. Thus the formidable "needle in a haystack" problem is narrowed down to finding an optimal solution among a computationally tractable number of alternatives. Encouraging results obtained on twelve small alpha-helical globular proteins with the pathway outlined above are presented and discussed.
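The radius-of-gyration filter mentioned above can be illustrated with a minimal sketch. It assumes each trial structure is a list of (x, y, z) coordinates (for example, one point per residue) and uses an unweighted radius of gyration. The function names and the `max_rg` cutoff are hypothetical; the abstract does not give the calibrated thresholds or the exact form of the filter.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one coordinate")
    # Geometric centre of the structure.
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    # Mean squared distance of the points from the centre.
    sq = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2
             for p in coords)
    return math.sqrt(sq / n)


def screen_by_rg(trial_structures, max_rg):
    """Keep only trial structures whose radius of gyration is <= max_rg.

    max_rg is a hypothetical cutoff standing in for a calibrated threshold.
    """
    return [s for s in trial_structures if radius_of_gyration(s) <= max_rg]


if __name__ == "__main__":
    compact = [(1, 1, 0), (1, -1, 0), (-1, 1, 0), (-1, -1, 0)]
    extended = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (30, 0, 0)]
    kept = screen_by_rg([compact, extended], max_rg=5.0)
    print(len(kept))  # the extended chain is rejected as too loosely packed
```

A persistence-length filter would slot into the same screening step; it is omitted here because the abstract does not define how it is computed.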
We describe here an energy-based computer software suite for narrowing down the search space of tertiary structures of small globular proteins. The protocol comprises eight different computational modules that form an automated pipeline. It combines physics-based potentials with biophysical filters to arrive at 10 plausible candidate structures starting from sequence and secondary structure information. The methodology has been validated here on 50 small globular proteins consisting of 2–3 helices and strands with known tertiary structures. For each of these proteins, a structure within 3–6 Å RMSD (root mean square deviation) of the native has been obtained among the 10 lowest-energy structures. The protocol has been web enabled and is accessible at .
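The validation criterion above (a near-native structure among the 10 lowest-energy candidates) can be sketched as follows. This is an illustration, not the suite's implementation: it assumes each candidate is an `(energy, coords)` pair, that all coordinate lists have the same atom ordering as the native, and that the structures have already been superimposed on the native. A real comparison would first need an optimal superposition (for example, the Kabsch algorithm). The function names are hypothetical.

```python
import math


def rmsd(a, b):
    """Root mean square deviation between two equal-length lists of
    (x, y, z) points. No superposition is performed here."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    sq = sum((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
             for (x1, y1, z1), (x2, y2, z2) in zip(a, b))
    return math.sqrt(sq / len(a))


def best_rmsd_in_top_n(candidates, native, n=10):
    """Lowest RMSD to the native among the n lowest-energy candidates.

    candidates: non-empty list of (energy, coords) pairs.
    """
    top = sorted(candidates, key=lambda c: c[0])[:n]
    return min(rmsd(coords, native) for _, coords in top)


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]  # every atom displaced by 3 units
    candidates = [(-5.0, shifted), (-1.0, native)]
    print(best_rmsd_in_top_n(candidates, native, n=2))
```

Under this criterion, a protein counts as a success when the returned value falls within the chosen cutoff (3–6 Å in the text above).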
Arriving at the native conformation of a polypeptide chain, characterized by the lowest free energy, is a problem of long-standing interest in protein structure prediction. Owing to the computational cost of developing free energy estimates, scoring functions, whether energy-based or statistical, have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native-like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology-based, internet-accessible three-dimensional model builders are now available for validating the scoring functions. We describe here an all-atom, energy-based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where the native structure is ranked second and seventh, the native is identified as the lowest-energy structure for 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at
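The decoy-set evaluation described above reduces to finding the rank of the native when all structures for a sequence are sorted by energy. A minimal sketch, assuming the scoring function has already produced one energy per structure (the scoring function itself is not reproduced here, and `native_rank` is a hypothetical helper name):

```python
def native_rank(native_energy, decoy_energies):
    """1-based rank of the native among its decoys by ascending energy.

    Rank 1 means no decoy scores strictly lower than the native.
    """
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


if __name__ == "__main__":
    # Illustrative energies only, not values from the study.
    print(native_rank(-10.0, [-3.0, -7.5, -1.0]))
    print(native_rank(-5.0, [-6.0, -3.0]))
```

Ties are resolved in the native's favour in this sketch; a stricter evaluation could count equal-energy decoys against it.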