2011
DOI: 10.1093/bioinformatics/btr579

Predicting residue–residue contacts using random forest models

Abstract: Motivation: Protein residue-residue contact prediction can be useful in predicting protein 3D structures. Current algorithms for such a purpose leave room for improvement. Results: We develop ProC_S3, a set of Random Forest algorithm-based models, for predicting residue-residue contact maps. The models are constructed based on a collection of 1490 nonredundant, high-resolution protein structures using >1280 sequence-based features. A new amino acid residue contact propensity matrix and a new set of seven amino a…

Citations: cited by 53 publications (48 citation statements)
References: 37 publications
“…RF has been extensively used in bioinformatics applications, including the prediction of disease-causing mutations [8,47-49]. The popularity of RF is due in part to its simplicity with no fine-tuning of parameters required and in part to its speed of classification, which is often faster than an equivalent SVM model [50]. In this study, as we are combining multiple classification models and evaluating different training sets, this advantage of RF (limited tuning required) over SVM (tuning required) was considerable.…”
Section: Methods (mentioning)
confidence: 99%
“…Our previous work [51,52] has indicated that random forest and Support Vector Machine (SVM) usually demonstrate good performance with various datasets. This finding is consistent with the recently published work of Fernandez-Delgado et al. [53].…”
Section: Methods (mentioning)
confidence: 99%
“…In this work, random forest was applied because it is generally more robust than SVM, which is a parameter-sensitive method and requires a long period of time to optimize parameters. The random forest package in R software was used in this study, as in our previous study [52]. The ntree parameter was set to 5,000, which historically has demonstrated good performance [51,52], and the importance was set to TRUE.…”
Section: Methods (mentioning)
confidence: 99%
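
The configuration quoted above (many trees, feature importance enabled) can be illustrated with a minimal sketch. It uses Python and scikit-learn rather than the R package the citing authors used, so it is only a rough analogue: `n_estimators` plays the role of `ntree`, and scikit-learn's `feature_importances_` is an impurity-based score, not the same set of measures that R reports with `importance = TRUE`. The helper name `fit_forest` and the synthetic data are hypothetical, and the demo uses 200 trees instead of 5,000 to keep it quick.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def fit_forest(X, y, n_trees=5000, seed=0):
    """Fit a random forest; n_trees is the rough analogue of R's ntree."""
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    return forest


# Synthetic stand-in data (an assumption; not the data of any cited study).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)

# Fewer trees than the quoted 5,000, purely to keep the demo fast.
forest = fit_forest(X, y, n_trees=200)

# Rank features by impurity-based importance, highest first.
importances = forest.feature_importances_
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
for index, score in ranked[:3]:
    print(f"feature {index}: importance {score:.3f}")
```

Apart from the tree count and the random seed, the forest's settings are left at their defaults here, which is consistent with the limited-tuning argument made in the statements above.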
“…Li et al [40] developed ProC_S3, based on a set of Random Forest algorithm based models using 1287 sequence-based features. Marks et al [43] use a global model of maximum entropy constrained by correlated mutations from multiple sequence alignments.…”
Section: Contact Map Prediction (mentioning)
confidence: 99%
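
For readers unfamiliar with the prediction target, the following minimal sketch shows what a residue-residue contact map is: a symmetric binary matrix marking residue pairs whose representative atoms lie within a distance cutoff. The 8 Å cutoff and the `min_separation` parameter are assumptions based on common practice in the field, not values taken from the ProC_S3 paper, and the function name `contact_map` and the toy coordinates are hypothetical.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix of residue pairs closer than threshold.

    coords: one (x, y, z) tuple per residue, in angstroms.
    threshold: distance cutoff in angstroms (assumed, not from the paper).
    min_separation: minimum sequence distance |i - j| for a pair to count.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Toy chain: four residues spaced 3.8 angstroms apart along a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
for row in cmap:
    print(row)
```

A sequence-based predictor such as the one described above would estimate the entries of this matrix from sequence-derived features alone, without access to the coordinates.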