2023
DOI: 10.1021/acs.jcim.2c01317

Serverless Prediction of Peptide Properties with Recurrent Neural Networks

Abstract: We present three deep learning sequence-based prediction models for peptide properties including hemolysis, solubility, and resistance to nonspecific interactions that achieve comparable results to the state-of-the-art models. Our sequence-based solubility predictor, MahLooL, outperforms the current state-of-the-art methods for short peptides. These models are implemented as a static website without the use of a dedicated server or cloud computing. Web-based models like this allow for accessible and effective r…
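The abstract describes recurrent, sequence-based classifiers served from a static website. As a rough illustration only, not the paper's actual architecture, encoding, or hyperparameters, a bidirectional-LSTM peptide classifier in Keras might look like the sketch below; AMINO_ACIDS, MAX_LEN, and all layer sizes are assumptions.

```python
# Illustrative sketch (not the paper's exact model): a small
# bidirectional-LSTM binary classifier over amino-acid sequences.
import tensorflow as tf

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"                     # standard 20 residues
AA_TO_INT = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # 0 = padding
MAX_LEN = 190                                            # assumed max peptide length

def encode(seq: str) -> list[int]:
    """Map a peptide string to a fixed-length integer vector (0-padded)."""
    ids = [AA_TO_INT.get(aa, 0) for aa in seq[:MAX_LEN]]
    return ids + [0] * (MAX_LEN - len(ids))

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(AMINO_ACIDS) + 1,
                              output_dim=32, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation="sigmoid"),      # e.g. hemolytic / not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

A trained model of this kind can be converted for in-browser inference (for example with the TensorFlow.js converter), which is what makes a static-website deployment without a dedicated server or cloud backend feasible.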

Cited by 22 publications (32 citation statements)
References 73 publications (96 reference statements)
“…Following the encoding procedure, we split each data set into three nonoverlapping subsets: a training set (to train the model) consisting of 81% of the data set, a validation set (for hyperparameter tuning) consisting of 9% of the data set, and a test set (to benchmark the model’s performance on unseen data) consisting of 10% of the data set. This specific train–validation–test split of 81%–9%–10% has been selected to ensure a proper comparison between our approach and the previous methodologies. Data augmentations, if any (such as in the solubility task), are then applied to the training set, while the validation and test sets remain unchanged.…”
Section: Methods
confidence: 99%
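The 81%–9%–10% split quoted above can be reproduced by first carving off 10% of the data as the test set and then taking 10% of the remainder as the validation set (0.9 × 0.1 = 9% of the original). A minimal sketch using scikit-learn follows; the augment() step is a hypothetical placeholder, not the paper's augmentation code.

```python
# Minimal sketch of an 81%/9%/10% train/validation/test split,
# with any augmentation applied to the training portion only.
from sklearn.model_selection import train_test_split

def split_81_9_10(X, y, seed=0):
    # Hold out 10% of the full data set as the test set.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.10, random_state=seed)
    # Split the remaining 90% into 81%/9% of the original
    # (i.e. a 90/10 split of the remainder).
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=0.10, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

# Usage with an assumed augmentation step:
# (X_tr, y_tr), (X_va, y_va), (X_te, y_te) = split_81_9_10(X, y)
# X_tr, y_tr = augment(X_tr, y_tr)  # hypothetical; val/test stay untouched
```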
“…The training data for the nonfouling task primarily consist of shorter sequences in the range of 2–20 residues. The data set employed for this task consists of instances of negative examples that are predominantly associated with insoluble peptides, which could lead to an increase in accuracy if only soluble peptides are compared. Our predictive model achieves an accuracy of 70.018% with augmentations for the solubility task.…”
Section: Methods
confidence: 99%
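The caveat in this statement, that measured accuracy can shift depending on whether insoluble negative examples are included, can be made concrete with a small evaluation helper. The is_soluble flag and the comparison itself are illustrative assumptions, not part of the paper's reported protocol.

```python
# Hedged sketch: compare accuracy on the full test set against accuracy
# restricted to peptides flagged as soluble, to expose the data-set bias
# described above. All inputs are hypothetical arrays of the same length.
import numpy as np

def accuracy(y_true, y_pred):
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def soluble_only_accuracy(y_true, y_pred, is_soluble):
    """Accuracy computed only over peptides marked as soluble."""
    mask = np.asarray(is_soluble, dtype=bool)
    return accuracy(np.asarray(y_true)[mask], np.asarray(y_pred)[mask])
```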