2018
DOI: 10.1038/s41598-018-19752-w

AmPEP: Sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest

Abstract: Antimicrobial peptides (AMPs) are promising candidates in the fight against multidrug-resistant pathogens owing to AMPs’ broad range of activities and low toxicity. Nonetheless, identification of AMPs through wet-lab experiments is still expensive and time consuming. Here, we propose an accurate computational method for AMP prediction by the random forest algorithm. The prediction model is based on the distribution patterns of amino acid properties along the sequence. Using our collection of large and diverse …

Citations: cited by 237 publications (205 citation statements)
References: 35 publications
“…Larger values of importance indicate stronger predictors, and values close to zero suggest the variable is not a good predictor. RF are popular because of their ability to deal with large numbers of covariates, non-linear associations, complex interactions and correlations between variables; RF have been used in many biomedical research fields [23–27]. In our RF variable importance (RFVI) analysis we converted all predictor variables into binomials, to avoid reported possible bias of RF when used with categorical variables with multiple levels, or correlated predictors [28, 29].…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…AMP prediction has also been performed by RF methods, which are based on ensemble learning algorithms and work by multiple decision trees built on training data (Schierz, 2009). In terms of AMP prediction, studies have proposed a new tool called AmPEP (Table 1), as an attempt to develop a highly accurate RF classifier for AMP prediction based on pattern distribution and physicochemical properties (Bhadra et al, 2018). Its performance was comparable with other predictive tools and it showed higher values for particular parameters of comparison, even with a reduced number of features.…”
Section: Antibp2 | Citation type: mentioning
confidence: 99%
“…Previous studies have organized amino acids into several physicochemical property groups [13,17]. As shown in Appendix A Table A3, seven physicochemical properties were used in the grouping: (1) charge, (2) hydrophobicity, (3) polarity, (4) polarizability, (5) secondary structure, (6) normalized van der Waals volume, and (7) solvent accessibility.…”
Section: Feature Constructions | Citation type: mentioning
confidence: 99%
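
The statement above groups amino acids by physicochemical property, and the abstract describes features built from how those groups are distributed along the sequence. The following is a minimal sketch of one such distribution-style descriptor, not the implementation from the cited papers. It uses only one of the seven properties (charge). The three-class grouping, the choice of the first, 25%, 50%, 75% and last occurrence as reference points, and the names `CHARGE_GROUPS`, `charge_class` and `distribution_descriptor` are all assumptions made for illustration.

```python
import math

# Illustrative grouping by charge (an assumption; the grouping table
# referenced in the citing paper is not reproduced here).
CHARGE_GROUPS = {
    "positive": set("KR"),
    "negative": set("DE"),
}


def charge_class(residue):
    """Return the charge class of a one-letter residue code."""
    for name, members in CHARGE_GROUPS.items():
        if residue in members:
            return name
    return "neutral"


def distribution_descriptor(sequence, fractions=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """For each class, locate the residue marking the first, 25%, 50%,
    75% and last occurrence of that class, and report its position as
    a percentage of the sequence length (0.0 if the class is absent)."""
    seq = sequence.upper()
    n = len(seq)
    features = {}
    for cls in ("positive", "neutral", "negative"):
        # 1-based positions of every residue that belongs to this class
        positions = [i + 1 for i, aa in enumerate(seq) if charge_class(aa) == cls]
        for f in fractions:
            if not positions:
                value = 0.0
            else:
                # index of the occurrence that covers fraction f of the class
                k = max(1, math.ceil(f * len(positions)))
                value = 100.0 * positions[k - 1] / n
            features[f"{cls}_{int(f * 100):03d}"] = value
    return features


features = distribution_descriptor("KKAADE")
```

Each sequence yields 15 numbers per property under these assumptions (3 classes × 5 reference points). Repeating the calculation for the other properties would give a fixed-length feature vector that a random forest classifier could take as input.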
“…Several studies have been dedicated to the prediction of AMPs, such as AntiBP [5], AntiBP2 [6], CAMP [7], ClassAMP [8], AVPpred [9], AMPER [10], iAMP-2L [11], iAMPred [12], AmPEP [13], Figure 1A demonstrates the average AACs of AMPs and non-AMPs. Specifically, "L", "G", and "K" were abundant amino acids for AMPs, while "L", "A", and "G" were abundant amino acids for non-AMPs.…”
Section: Introduction | Citation type: mentioning
confidence: 99%