2021
DOI: 10.1007/s10822-021-00418-1

StackHCV: a web-based integrative machine-learning framework for large-scale identification of hepatitis C virus NS5B inhibitors

Abstract: Fast and accurate identification of inhibitors with potency against HCV NS5B polymerase is currently a challenging task. Although conventional experimental methods are the gold standard for the design and development of new HCV inhibitors, they often require a costly investment of time and resources. In this study, we develop a novel machine learning-based metapredictor (termed StackHCV) for accurate and large-scale identification of HCV inhibitors. Unlike the existing method, which is based on single-feature-ba…

citations
Cited by 21 publications
(13 citation statements)
references
References 62 publications
“…Unlike other ensemble learning strategies, this strategy enables an automatic integration of different ML classifiers in order to construct a single robust prediction model 23. The stacked strategy has successfully achieved better performance as compared with its constituent baseline models 23, 24, 27, 30, 31. The stacking strategy consists of two main steps, while the corresponding models at each step are referred to as baseline and meta models, respectively.…”
Section: Methods (mentioning)
confidence: 99%
“…The feature subset achieving the highest Matthews correlation coefficient (MCC) was considered as the optimal feature subset. The implementation of these classifiers in the two-step feature selection strategy is the same as used in our previous studies 18, 31, 38–41 …”
Section: Methods (mentioning)
confidence: 99%
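The MCC-based selection rule quoted above can be illustrated with a minimal sketch. The candidate subset names and their predictions are hypothetical toy values, not data from the cited study, and the helper `best_subset` is an illustrative name rather than anything the authors describe.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_subset(y_true, predictions_by_subset):
    """Return (name, MCC) of the candidate subset with the highest MCC."""
    scores = {name: mcc(y_true, pred) for name, pred in predictions_by_subset.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical validation labels and per-subset predictions (assumed, for illustration only).
y_true = [1, 1, 1, 0, 0, 0]
candidates = {
    "top_10_features": [1, 1, 0, 0, 0, 1],
    "top_20_features": [1, 1, 1, 0, 0, 1],
    "top_30_features": [1, 0, 0, 0, 1, 1],
}
best_name, best_score = best_subset(y_true, candidates)
print(best_name, round(best_score, 4))
```

MCC uses all four cells of the confusion matrix, so it stays informative when active and inactive compounds are imbalanced, which is one plausible reason for preferring it over plain accuracy as a selection criterion.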
“…In this phase, we applied 12 well-known feature encodings to extract samples in the AR-TRN dataset, including CKD, CKDExt, CKDGraph, AP2D, KR, MACCS, Circle, Estate, Hybrid, PubChem, FP4C, and FP4. These molecular descriptors are widely used to represent several types of inhibitors [41, 45–48]. In the meanwhile, 13 popular ML algorithms were selected for the construction of baseline models, including RF, AdaBoost (ADA), light gradient boosting machine (LGBM), partial least squares (PLS), multilayer perceptron (MLP), naive Bayes (NB), decision tree (DT), extremely randomized trees (ET), extreme gradient boosting (XGB), k-nearest neighbor (KNN), logistic regression (LR), support vector machine (SVM) combined with linear (SVMLN) and radial basis function (SVMRBF) kernels.…”
Section: Methods (mentioning)
confidence: 99%
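To get a sense of the size of the baseline pool this statement implies, the sketch below enumerates every pairing of the 12 encodings with the 13 algorithms. That every pair is actually trained is an assumption made here; the quoted text does not say so explicitly, and the `ALG-ENC` naming scheme is purely illustrative.

```python
from itertools import product

# Names copied from the quoted statement.
ENCODINGS = ["CKD", "CKDExt", "CKDGraph", "AP2D", "KR", "MACCS",
             "Circle", "Estate", "Hybrid", "PubChem", "FP4C", "FP4"]
ALGORITHMS = ["RF", "ADA", "LGBM", "PLS", "MLP", "NB", "DT",
              "ET", "XGB", "KNN", "LR", "SVMLN", "SVMRBF"]

# Assumption: one baseline model per (encoding, algorithm) pair.
baseline_grid = [f"{alg}-{enc}" for enc, alg in product(ENCODINGS, ALGORITHMS)]

print(len(baseline_grid))  # 12 * 13 pairings
```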
“…Unlike other conventional ensemble strategies, the stacking strategy integrates the strengths of different predictive models without human intervention to generate the final meta-predictor 44–47. To date, numerous previous studies have indicated that the final meta-predictor can potentially attain a more stable predictive performance 48–50. The overall workflow for the development of StackPR contains three major steps (i.e., baseline model construction, new feature vector generation, and meta-predictor development) as provided in the paragraphs hereafter (Fig.…”
Section: Methods (mentioning)
confidence: 99%
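The three stacking steps named in this statement (baseline model construction, new feature vector generation, meta-predictor development) can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the StackPR or StackHCV implementation: the nearest-centroid baseline learner, the logistic-regression meta-model, the interleaved 3-fold split, and the data are all stand-ins chosen to keep the example self-contained.

```python
import math


class CentroidModel:
    """Toy baseline learner (assumed): scores a sample by distance to class centroids."""

    def __init__(self, columns):
        self.columns = columns  # the input columns this baseline model sees

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [[x[c] for c in self.columns] for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        v = [x[c] for c in self.columns]
        d0 = math.dist(v, self.centroids[0])
        d1 = math.dist(v, self.centroids[1])
        # Closer to the class-1 centroid gives a score nearer 1.0.
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)


def make_models():
    """Step 1: baseline model construction (two toy models on different columns)."""
    return [CentroidModel([0, 1]), CentroidModel([2, 3])]


def out_of_fold_features(X, y, n_folds=3):
    """Step 2: new feature vector = baseline scores predicted on held-out folds."""
    features = [None] * len(X)
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        held_out = [i for i in range(len(X)) if i % n_folds == fold]
        models = [m.fit([X[i] for i in train], [y[i] for i in train])
                  for m in make_models()]
        for i in held_out:
            features[i] = [m.predict_proba(X[i]) for m in models]
    return features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta(F, y, lr=0.5, epochs=500):
    """Step 3: meta-predictor, here logistic regression by batch gradient descent."""
    w, b = [0.0] * len(F[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for f, t in zip(F, y):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - t
            gw = [g + err * fi for g, fi in zip(gw, f)]
            gb += err
        w = [wi - lr * g / len(F) for wi, g in zip(w, gw)]
        b -= lr * gb / len(F)
    return w, b


# Toy data (assumed): class 1 has high values, class 0 has low values.
X = [[0.9, 0.8, 0.7, 0.9], [0.1, 0.2, 0.3, 0.1], [0.8, 0.9, 0.8, 0.7],
     [0.2, 0.1, 0.2, 0.3], [0.7, 0.7, 0.9, 0.8], [0.3, 0.2, 0.1, 0.2],
     [0.9, 0.9, 0.8, 0.8], [0.1, 0.1, 0.2, 0.2]]
y = [1, 0, 1, 0, 1, 0, 1, 0]

meta_features = out_of_fold_features(X, y)
w, b = fit_meta(meta_features, y)
final_models = [m.fit(X, y) for m in make_models()]  # refit baselines on all data


def stack_predict(x):
    """Score a new sample: baseline scores first, then the meta-predictor."""
    f = [m.predict_proba(x) for m in final_models]
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)


prob_high = stack_predict([0.85, 0.8, 0.9, 0.75])
prob_low = stack_predict([0.15, 0.2, 0.1, 0.25])
```

Building the meta-features from held-out folds, rather than from predictions on the training samples themselves, keeps the meta-predictor from simply learning how well each baseline model memorised its own training data.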