Motivation: Intrinsically disordered proteins play a crucial role in numerous regulatory processes. Their abundance and ubiquity combined with a relatively low quantity of their annotations motivate research toward the development of computational models that predict disordered regions from protein sequences. Although the prediction quality of these methods continues to rise, novel and improved predictors are urgently needed.Results: We propose a novel method, named MFDp (Multilayered Fusion-based Disorder predictor), that aims to improve over the current disorder predictors. MFDp is as an ensemble of 3 Support Vector Machines specialized for the prediction of short, long and generic disordered regions. It combines three complementary disorder predictors, sequence, sequence profiles, predicted secondary structure, solvent accessibility, backbone dihedral torsion angles, residue flexibility and B-factors. Our method utilizes a custom-designed set of features that are based on raw predictions and aggregated raw values and recognizes various types of disorder. The MFDp is compared at the residue level on two datasets against eight recent disorder predictors and top-performing methods from the most recent CASP8 experiment. In spite of using training chains with ≤25% similarity to the test sequences, our method consistently and significantly outperforms the other methods based on the MCC index. The MFDp outperforms modern disorder predictors for the binary disorder assignment and provides competitive real-valued predictions. The MFDp's outputs are also shown to outperform the other methods in the identification of proteins with long disordered regions.Availability: http://biomine.ece.ualberta.ca/MFDp.htmlSupplementary information: Supplementary data are available at Bioinformatics online.Contact: lkurgan@ece.ualberta.ca
Sequence-based prediction of protein secondary structure (SS) enjoys wide-spread and increasing use for the analysis and prediction of numerous structural and functional characteristics of proteins. The lack of a recent comprehensive and large-scale comparison of the numerous prediction methods results in an often arbitrary selection of a SS predictor. To address this void, we compare and analyze 12 popular, standalone and high-throughput predictors on a large set of 1975 proteins to provide in-depth, novel and practical insights. We show that there is no universally best predictor and thus detailed comparative studies are needed to support informed selection of SS predictors for a given application. Our study shows that the three-state accuracy (Q3) and segment overlap (SOV3) of the SS prediction currently reach 82% and 81%, respectively. We demonstrate that carefully designed consensus-based predictors improve the Q3 by additional 2% and that homology modeling-based methods are significantly better by 1.5% Q3 than ab initio approaches. Our empirical analysis reveals that solvent exposed and flexible coils are predicted with a higher quality than the buried and rigid coils, while inverse is true for the strands and helices. We also show that longer helices are easier to predict, which is in contrast to longer strands that are harder to find. The current methods confuse 1-6% of strand residues with helical residues and vice versa and they perform poorly for residues in the β- bridge and 3(10)-helix conformations. Finally, we compare predictions of the standalone implementations of four well-performing methods with their corresponding web servers.
The exact mechanisms of prion misfolding and factors that predispose an individual to prion diseases are largely unknown. Our approach to identifying candidate factors in-silico relies on contrasting the C-terminal domain of PrPC sequences from two groups of vertebrate species: those that have been found to suffer from prion diseases, and those that have not. We propose that any significant differences between the two groups are candidate factors that may predispose individuals to develop prion disease, which should be further analyzed by wet-lab investigations. Using an array of computational methods we identified possible point mutations that could predispose PrPC to misfold into PrPSc. Our results include confirmatory findings such as the V210I mutation, and new findings including P137M, G142D, G142N, D144P, K185T, V189I, H187Y and T191P mutations, which could impact structural stability. We also propose new hypotheses that give insights into the stability of helix-2 and -3. These include destabilizing effects of Histidine and T188-T193 segment in helix-2 in the disease-prone prions, and a stabilizing effect of Leucine on helix-3 in the disease-resistant prions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.