2021
DOI: 10.3389/fbioe.2020.627335

Predicting Cell Wall Lytic Enzymes Using Combined Features

Abstract: Owing to the overuse of antibiotics and the rapid rise of antibiotic-resistant strains, there is growing concern that existing antibiotics will become ineffective against pathogens. The use of cell wall lytic enzymes to destroy bacteria has become a viable alternative for avoiding the antimicrobial-resistance crisis. In this paper, an improved method for cell wall lytic enzyme prediction is proposed, combining the amino acid composition (AAC), the dipeptide composition (DC), the position-specific score matrix auto-covarian…
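The truncated abstract names several sequence-derived feature sets. As a rough illustration of the first two, the sketch below computes AAC and DC for a toy protein sequence; this is not the authors' implementation, and the sequence is a hypothetical placeholder.

```python
# Minimal sketch of two features named in the abstract: amino acid
# composition (AAC, 20 fractions) and dipeptide composition (DC, 400
# fractions). Toy input, not the study's code.
from itertools import product

AA = "ACDEFGHIKLMNPQRSTVWY"

def aac(seq: str) -> dict:
    """Fraction of each of the 20 standard amino acids."""
    n = len(seq)
    return {a: seq.count(a) / n for a in AA}

def dc(seq: str) -> dict:
    """Fraction of each of the 400 ordered dipeptides."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs)
    return {a + b: pairs.count(a + b) / n for a, b in product(AA, AA)}

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical toy sequence
print(sum(aac(seq).values()), sum(dc(seq).values()))  # both sum to ~1.0
```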

Cited by 5 publications (3 citation statements)
References 63 publications (65 reference statements)

“…The SMOTE algorithm uses a combination of oversampling the minority class and undersampling the majority class for better classification performance. The method has been used successfully in various studies to eliminate the class imbalance [39], [40], [41]. Finally, each sortase class consisted of 277 sequences in the balanced training dataset, except for class E, which contained 276 sequences.…”
Section: Methods
confidence: 99%
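The quoted passage describes class balancing with SMOTE. A minimal sketch of that step using the imbalanced-learn library follows; the class sizes and feature matrix are synthetic stand-ins, not the sortase dataset from the citing study.

```python
# Minimal sketch: balancing a multi-class dataset with SMOTE
# (imbalanced-learn). X and y are synthetic placeholders.
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
# Hypothetical imbalanced data: three classes of 300/80/40 samples.
X = rng.normal(size=(420, 20))
y = np.array([0] * 300 + [1] * 80 + [2] * 40)

smote = SMOTE(random_state=0)            # oversample minorities up to the majority size
X_bal, y_bal = smote.fit_resample(X, y)
print(Counter(y), "->", Counter(y_bal))  # {0: 300, 1: 80, 2: 40} -> all 300
```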
“…First, the F-score of each feature (connection) in the training set was calculated and ranked in descending order as in previous studies [50]. Second, a subset of the original training set was generated by including the features (connections) with the top N F-scores successively, where N = 1, 2, …, m, and m is the total number of features (connections) (23 × 22/2) [51]. Then, a grid search using leave-one-out cross-validation (LOOCV) was carried out to find the optimized values of (C, γ), where C denotes the penalty parameter and γ represents the kernel width parameter [52].…”
Section: Methods
confidence: 99%
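The quoted pipeline ranks features by F-score, grows top-N subsets, and tunes an RBF-SVM's (C, γ) with a LOOCV grid search. A minimal sketch under assumptions follows: scikit-learn's f_classif (ANOVA F-value) stands in for the exact F-score formula in the cited work, and the dataset is a synthetic placeholder.

```python
# Minimal sketch: F-score feature ranking + (C, gamma) grid search
# with leave-one-out cross-validation for an RBF-SVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.svm import SVC

X, y = make_classification(n_samples=60, n_features=30, random_state=0)

# Rank features by F-score in descending order.
f_scores, _ = f_classif(X, y)
order = np.argsort(f_scores)[::-1]

param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1]}
best = (0.0, None, None)                 # (LOOCV accuracy, N, params)
for n in range(1, X.shape[1] + 1):       # grow the top-N feature subset
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=LeaveOneOut())
    search.fit(X[:, order[:n]], y)
    if search.best_score_ > best[0]:
        best = (search.best_score_, n, search.best_params_)
print(best)
```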
“…Feature selection not only aids in reducing the dimensionality of genomic data but also enhances model interpretability and reduces overfitting, making it an invaluable component of ML-based genomic selection strategies [12,13]. Machine learning methods, such as Random Forest, have been utilized for SNP screening to enhance model predictive capability [14]. Therefore, using appropriate feature selection methods can quickly eliminate the impact of unnecessary features from genome-wide markers in rice, thereby enhancing the accuracy of subsequent analyses and increasing computational speed [15,16].…”
Section: Introduction
confidence: 99%
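The passage mentions Random Forest for SNP screening. A minimal sketch of one common approach, ranking markers by impurity-based feature importance, follows; the 0/1/2 genotype matrix and the cutoff of 50 markers are hypothetical choices, not details from the cited study.

```python
# Minimal sketch: Random Forest-based SNP screening by feature importance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_samples, n_snps = 200, 1000
X = rng.integers(0, 3, size=(n_samples, n_snps))  # 0/1/2 allele-count codes
# Hypothetical trait driven by two causal SNPs plus noise.
y = (X[:, 10] + X[:, 42] + rng.normal(size=n_samples) > 3).astype(int)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:50]  # keep top 50 candidate SNPs
print("top SNP indices:", top[:10])
```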