The formation of intracellular ice in cold-blooded organisms can be
lethal. These species produce antifreeze proteins (AFPs) to survive
in subzero environments. AFPs are also used in biotechnological,
industrial, agricultural, and medical applications. Machine
learning-based predictors have been proposed
for AFP identification. However, more accurate predictors are still
needed. This work presents a novel approach, named AFP-SPTS, for the
accurate prediction of AFPs.
We extracted discriminative features with four encoding schemes, namely,
dipeptide deviation from the expected mean (DDE), reduced amino acid
alphabet (RAAA), grouped dipeptide composition (GDPC), and a novel
representation method called pseudo-position-specific scoring matrix
tri-slicing (PseTS-PSSM). Considering the advantages of the ensemble learning
strategy, we fused the feature vectors in different combinations
and trained models with five machine learning algorithms, i.e.,
multilayer perceptron (MLP), extremely randomized trees (ERT), decision
tree (DT), random forest (RF), and AdaBoost. Among all models, ERT
trained on PseTS-PSSM + RAAA features achieved the best results.
The proposed predictor (AFP-SPTS) improved on the accuracies of
existing AFP predictors in the literature by 1.82% and 4.1%.
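
To make one of the four encoding schemes concrete, the following is a minimal sketch of grouped dipeptide composition (GDPC). The abstract does not specify the residue grouping, so the five physicochemical classes used here are an assumption based on a commonly used convention, and the names GROUPS, GROUP_PAIRS, and gdpc are hypothetical. It is not the authors' implementation.

```python
from itertools import product

# Assumed five-class grouping of the 20 standard amino acids.
# The abstract does not state the grouping; this is a common convention.
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}
GROUP_PAIRS = list(product(GROUPS, repeat=2))  # 25 ordered group pairs


def gdpc(sequence):
    """Return the 25-dimensional grouped dipeptide composition of a sequence."""
    # Map each residue to its group; non-standard symbols (e.g. 'X') are
    # dropped, so their neighbours are treated as adjacent.
    labels = [RESIDUE_TO_GROUP[aa] for aa in sequence.upper() if aa in RESIDUE_TO_GROUP]
    counts = dict.fromkeys(GROUP_PAIRS, 0)
    for pair in zip(labels, labels[1:]):
        counts[pair] += 1
    total = max(len(labels) - 1, 1)  # guard against division by zero
    return [counts[pair] / total for pair in GROUP_PAIRS]


features = gdpc("AAKD")  # dipeptides: AA, AK, KD
```

A vector like this could then be concatenated with the other encodings and passed to a tree ensemble such as scikit-learn's ExtraTreesClassifier, which is one plausible way to realize the feature fusion described above.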