2022
DOI: 10.3390/ijms232314683

Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks

Abstract: Predicting SARS-CoV-2 mutations is difficult, but predicting recurrent mutations driven by the host, such as those caused by host deaminases, is feasible. We used machine learning to predict which positions from the SARS-CoV-2 genome will hold a recurrent mutation and which mutations will be the most recurrent. We used data from April 2021 that we separated into three sets: a training set, a validation set, and an independent test set. For the test set, we obtained a specificity value of 0.69, a sensitivity va…

Cited by 11 publications (12 citation statements)
References 78 publications
“…The PHATE visualization of the 296,437 retained de novo iSNVs by genomic position display no clustering according to the mutational pattern, underscoring the optimal curation of the dataset (Figure 6A). Additionally, the substitution spectrum of this curated set shows a prevalence of C > T and G > T substitutions (Figure 6B), aligning with consensus iSNV patterns and known SARS-CoV-2 mutational trends (Moshiri et al 2023; Fumagalli et al 2023; Bloom et al 2023; Saldivar-Espinoza et al 2023).…”
Section: Results
Citation type: supporting
confidence: 71%
“…This is mainly due to the fact that SARS-CoV-2 evolution through accumulation of escape genotypes is catalyzed by the presence of human anti-SARS-CoV-2 antibodies elicited by natural or artificial immunity [28]. Notably, the potential suitability of this proof-of concept study has been recently confirmed by another article, which used artificial neural networks for predicting recurrent mutations (rather than variants) in SARS-CoV-2 [29]. Briefly, the authors used a machine approach based on genomic rather than ecologic variables for identifying positions in SARS-CoV-2 genome where recurrent mutations are more likely to occur, ultimately achieving 0.79 sensitivity, 0.69 specificity and an overall area under the curve (AUC) of 0.80.…”
Section: Discussion
Citation type: mentioning
confidence: 91%
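
The sensitivity and specificity figures quoted in the statement above are ratios taken from a binary confusion matrix. The sketch below only illustrates how such ratios are computed; the function name `sensitivity_specificity` and the toy labels are hypothetical and are not code or data from the cited paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Label 1 stands for "position holds a recurrent mutation", 0 for "does not".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    # Sensitivity: fraction of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, invented purely for illustration (not results from any study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints "sensitivity=0.75, specificity=0.75"
```

The AUC mentioned alongside these values summarizes the same trade-off across all decision thresholds rather than at a single one, so it cannot be derived from one pair of sensitivity and specificity values.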
“…The continued evolution and almost unremittent emergence of new dominant and phylogenetically divergent SARS-CoV-2 Omicron sublineages presents a public health dilemma (i.e., a so-called “permacrisis”) and reinforces the need to develop predictive tools that may be capable of quickly identifying the escape of new variants from previous immunity. The feasibility and reliability of predicting the emergence of recurrent SARS-CoV-2 mutations has been recently demonstrated in the work of Saldivar-Espinoza et al [29], who developed a very complex mathematical model which was capable of anticipating nucleotide mutations and RNA reactivity with around 80% accuracy. We planned a similar approach for anticipating the emergence and spread of new variants of concerns, which does not detect divergence from a single nucleotide levels, but simplifies the entire process by including simple epidemiologic variables, such as date of emergence of new variants and number of COVID-19 cases and vaccinations, neither of which require complicated calculations or specific software.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…The SARS-CoV-2 pandemic has motivated the collection of virus genomic sequences on an unprecedented scale, which has generated invaluable data on the genomic diversity of an RNA virus. From the ensemble of observed consensus sequences of infected hosts we can extract, for the first time, an exhaustive map of possible amino acid replacements in viral proteins that are tolerable for viable virus (Zhao et al 2022; Saldivar-Espinoza et al 2023). This brings into stark relief our limited understanding of the genotype/phenotype relationship, which is very detailed on some local functional aspects, such as spike protein antigenicity, but not much developed in general.…”
Section: Discussion
Citation type: mentioning
confidence: 99%