2019
DOI: 10.1063/1.5113848

Prediction of amyloid aggregation rates by machine learning and feature selection

Abstract: A novel data-based machine learning algorithm for predicting amyloid aggregation rates is reported in this paper. Based on a highly nonlinear projection from 16 intrinsic features of a protein and 4 extrinsic features of the environment to the protein aggregation rate, a feedforward fully connected neural network (FCN) with one hidden layer is trained on a dataset composed of 21 different kinds of amyloid proteins and tested on the remaining 4 proteins. FCN shows a much better performance than traditional algorithms, s…

Cited by 13 publications (5 citation statements)
References 66 publications
“…For drug development, feature selection and classifiers are used to predict functional classes of newly generated protein sequences [25] and protein inhibitors and substrates [26]. In clinical tests, they are used to predict the rate of amyloid aggregation [27] and the production of high antivasoactive peptides [28]. In summary, the main idea of these studies is to narrow the feature set by filtering the relevant data through a combination of single or multiple feature algorithms [29] and then enter the new feature set into the classifier to classify and find the indicator that is most closely related to the disease.…”
Section: Related Work (mentioning)
confidence: 99%
“…The proteins anchored in the membrane can reveal the sensitivity of the polypeptide chain to its surroundings [6]. The aggregation of proteins can be treated as partial non-solubility, which is sometimes necessary in order to ensure biological activity, as well as loss of activity, as is observed in the misfolded proteins [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18], of which the amyloids are spectacular examples [19]. Recently, the intensively applied machine learning technique (applied in PROSO, for example) enables the prediction of protein solubility in heterologous expression in Escherichia coli [20, 21].…”
Section: Introduction (mentioning)
confidence: 99%
“…Computational approaches have been developed to estimate mechanistic and kinetic characteristics for better comprehension and prediction of colloidal instability and aggregation. Mechanistic tools aid in screening and minimizing APRs during the discovery phase (Kuhn et al., 2017; Prabakaran et al., 2017, 2020; van der Kant et al., 2017; Gil-Garcia et al., 2018; Rawat et al., 2018; Bauer et al., 2020; Ebo et al., 2020; Shahfar et al., 2022), while kinetic predictors estimate aggregation rates, crucial for liquid formulation development meeting regulatory requirements for shelf life (Rawat et al., 2019; Yang et al., 2019; Santos et al., 2020). Machine Learning (ML) can train kinetic models using extensive data sets with experimental and sequence/structure information (Rawat et al., 2019; Yang et al., 2019), facilitating prediction of optimal formulation compositions (pH, salt, excipients) for minimal kinetics.…”
Section: In Silico Assessments In Early Development (mentioning)
confidence: 99%