2017
DOI: 10.3390/ijms18071543

Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

Abstract: Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. The determination of hotspot residues is by the commonly-used method of alanine scanning mutagenesis experiments, which is always costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., using the information of solved protein structures. However, the number of solved protein structure…

Cited by 29 publications (15 citation statements)
References 42 publications (68 reference statements)
“…Robetta [6,10], HotPoint [2,3,5], MAPPIS [1], KFC [11], SpotOn [12], PredHS [13], iPPHOT [14], the method using docking approach [15], HSPred [16], HEP [17]) or physico-chemical properties of their residues (e.g. method using random projection-based classifier [18], MSCA [19], method applying ensemble learning [20], DICFC [21], iFrag [22]). Most of the aforementioned algorithms require knowledge of the protein structure, which is a significant drawback of these methods because the protein structure has been determined only for a limited number of proteins.…”
Section: Methods Of Hot Spot Identification
Citation type: mentioning
confidence: 99%
“…Physicochemical features (e.g. hydrophobicity, hydrophilicity, polarity and average accessible surface area) from the AAindex1 database [14] are extracted to predict hot spots [15, 16]. Position-specific scoring matrices (PSSMs) are a commonly used sequence feature that can be obtained from NCBI non-redundant databases via PSI-BLAST [17].…”
Section: Feature Engineering
Citation type: mentioning
confidence: 99%
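
To illustrate how a physicochemical property can be turned into a per-residue feature vector, here is a minimal sketch. It uses the Kyte-Doolittle hydropathy scale as one example of an AAindex-style property. The sliding-window encoding, the window size, the zero padding, and the function name `window_features` are assumptions made for illustration; the citing papers do not specify them.

```python
# Kyte-Doolittle hydropathy values, one example of a per-residue property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, center, half_width=3, pad=0.0):
    """Encode the residue at `center` by the hydropathy of its neighbours.

    Hypothetical helper: positions outside the sequence, and residues
    missing from the scale, are filled with `pad`.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(sequence):
            features.append(KYTE_DOOLITTLE.get(sequence[pos], pad))
        else:
            features.append(pad)
    return features
```

Each residue then yields a vector of length `2 * half_width + 1`, which could be concatenated with other properties before classification.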
“…Hu et al. [58] proposed a protein sequence-based model, in which the classifier is implemented by the improved IBK (Instance-based k means) algorithm of the k-nearest neighbors, which overcomes the shortcomings of the recent neighbor algorithm, which is sensitive to some data. Jiang et al. [16] also proposed a sequence-based model, using the IBK algorithm to obtain a better random projection set through the training set.…”
Section: Machine Learning Approaches For Hot Spot Prediction
Citation type: mentioning
confidence: 99%
“…Most of the works focused on the hotspot predictions from a curated small partial dataset of the whole protein sequences [13]. In Jiang’s work [14], the issue of hotspot determination was approached from whole natural protein sequences, and a random projection ensemble system based on k nearest neighbor algorithm to identify hotspot residues by sequence information alone was developed. Experimental results showed that although this method did not perform well enough in the real applications of hotspots, it was very promising in the determination of hotspot residues from whole sequences.…”
Section: Machine Learning Related Researches
Citation type: mentioning
confidence: 99%
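
The statements above describe a random projection ensemble built on k-nearest-neighbor classifiers. The sketch below shows the general idea only and is not the published implementation: each ensemble member projects the feature vectors through its own random matrix, runs kNN in the reduced space, and the members vote. The Gaussian projection, the majority vote, the class name `RandomProjectionKNNEnsemble`, and all default parameter values are assumptions.

```python
import math
import random
from collections import Counter


def random_projection_matrix(n_features, n_components, rng):
    """Gaussian random matrix, scaled by 1/sqrt(n_components)."""
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
            for _ in range(n_features)]


def project(x, matrix):
    """Map one feature vector into the lower-dimensional space."""
    return [sum(x[i] * matrix[i][j] for i in range(len(x)))
            for j in range(len(matrix[0]))]


def knn_predict(train_X, train_y, query, k):
    """Majority label among the k nearest training points (Euclidean)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


class RandomProjectionKNNEnsemble:
    """Hypothetical ensemble: one random projection plus kNN per member."""

    def __init__(self, n_members=11, n_components=4, k=3, seed=0):
        self.n_members = n_members
        self.n_components = n_components
        self.k = k
        self.seed = seed
        self.members = []
        self.labels = []

    def fit(self, X, y):
        rng = random.Random(self.seed)
        self.labels = list(y)
        self.members = []
        for _ in range(self.n_members):
            matrix = random_projection_matrix(len(X[0]), self.n_components, rng)
            projected = [project(x, matrix) for x in X]
            self.members.append((matrix, projected))
        return self

    def predict_one(self, x):
        # Each member votes in its own projected space; the majority wins.
        votes = Counter(
            knn_predict(projected, self.labels, project(x, matrix), self.k)
            for matrix, projected in self.members
        )
        return votes.most_common(1)[0][0]
```

Using odd values for `n_members` and `k` avoids tied votes in a two-class (hotspot versus non-hotspot) setting.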