CUPSAT (Cologne University Protein Stability Analysis Tool) is a web tool to analyse and predict protein stability changes upon point mutations (single amino acid mutations). This program uses structural environment specific atom potentials and torsion angle potentials to predict ΔΔG, the difference in free energy of unfolding between wild-type and mutant proteins. It requires the protein structure in Protein Data Bank format and the location of the residue to be mutated. The output consists information about mutation site, its structural features (solvent accessibility, secondary structure and torsion angles), and comprehensive information about changes in protein stability for 19 possible substitutions of a specific amino acid mutation. Additionally, it also analyses the ability of the mutated amino acids to adapt the observed torsion angles. Results were tested on 1538 mutations from thermal denaturation and 1603 mutations from chemical denaturation experiments. Several validation tests (split-sample, jack-knife and k-fold) were carried out to ensure the reliability, accuracy and transferability of the prediction method that gives >80% prediction accuracy for most of these validation tests. Thus, the program serves as a valuable tool for the analysis of protein design and stability. The tool is accessible from the link .
Sequence composition was found to provide sufficient information to predict the probability of its binding to DNA with nearly 69% sensitivity at 64% accuracy for the considered proteins; sequence neighbourhood and solvent accessibility information were sufficient to make binding site predictions with 40% sensitivity at 79% accuracy. Detailed analysis of binding residues shows that some three- and five-residue segments frequently bind to DNA and that solvent accessibility plays a major role in binding. Although, binding behaviour was not associated with any particular secondary structure, there were interesting exceptions at the residue level. Over-representation of some residues in the binding sites was largely lost at the total sequence level, but a different kind of compositional preference was observed in DNA-binding proteins.
RNA-binding proteins (RBPs) play key roles in post-transcriptional control of gene expression, which, along with transcriptional regulation, is a major way to regulate patterns of gene expression during development. Thus, the identification and prediction of RNA binding sites is an important step in comprehensive understanding of how RBPs control organism development. Combining evolutionary information and support vector machine (SVM), we have developed an improved method for predicting RNA binding sites or RNA interacting residues in a protein sequence. The prediction models developed in this study have been trained and tested on 86 RNA binding protein chains and evaluated using fivefold cross validation technique. First, a SVM model was developed that achieved a maximum Matthew's correlation coefficient (MCC) of 0.31. The performance of this SVM model further improved the MCC from 0.31 to 0.45, when multiple sequence alignment in the form of PSSM profiles was used as input to the SVM, which is far better than the maximum MCC achieved by previous methods (0.41) on the same dataset. In addition, SVM models were also developed on an alternative dataset that contained 107 RBP chains. Utilizing PSSM as input information to the SVM, the training/testing on this alternate dataset achieved a maximum MCC of 0.32. Conclusively, the prediction performance of SVM models developed in this study is better than the existing methods on the same datasets. A web server 'Pprint' was also developed for predicting RNA binding residues in a protein sequence which is freely available at http://www.imtech.res.in/raghava/pprint/.
Release 4.0 of ProTherm, thermodynamic database for proteins and mutants, contains approximately 14,500 numerical data (approximately 450% of the first version) of several thermodynamic parameters along with experimental methods and conditions, and structural, functional and literature information. The sequence and structural information of proteins is connected with thermodynamic data through links between entries in Protein Data Bank, Protein Information Resource and SWISS-PROT and the data in ProTherm. We have separated the Gibbs free energy change obtained at extrapolated temperature from the data on denaturation temperature measured by the thermal denaturation method. We have added the statistics of amino acid replacements and links to homologous structures to each protein. Further, we have improved the search and display options to enhance search capability through the web interface. ProTherm is freely available at http://gibk26. bse.kyutech.ac.jp/jouhou/Protherm/protherm.html.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.