2022
DOI: 10.21203/rs.3.rs-1578020/v1
Preprint

PhyloTransformer: A Self-supervised Discriminative Model for Mutation Prediction Based on a Multi-head Self-attention Mechanism

Abstract: Although coronaviruses have RNA proofreading functions, a large number of variants still exist as quasispecies. Identified coronaviruses might be just the tip of the iceberg, and potentially more fatal variants of concern (VOCs) may emerge over time. These VOCs may exhibit increased pathogenicity, infectivity, transmissibility, angiotensin-converting enzyme 2 (ACE2) binding affinity, and antigenicity, posing an increased threat to public health. In this article, we developed PhyloTransformer, a Transformer-ba…

Cited by 1 publication (2 citation statements, both mentioning); References: 0 publications; Year published: 2022
“…DL language models have also been applied for protein prediction tasks, as common protein motifs and domains can be analogized to words, phrases, and sentences in human language [38–41]. A Transformer-based discriminative model was trained with SARS-CoV-2 genetic sequences to predict potential mutations that may lead to enhanced virus transmissibility [42]. Motivated by the success of masked language models such as BERT [43], we design a pretrained protein language model for comprehensive variant prediction, aiming to simulate circulating viral mutation and predict potentially risky variants.…”
Section: Current State of the Art (mentioning; confidence: 99%)
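
Both citing statements center on the same technique: BERT-style masked language modeling applied to biological sequences. Purely as an illustration of that idea, the sketch below masks random amino-acid tokens in a protein sequence and trains a small Transformer encoder to recover them. Everything here (the vocabulary, model size, the ProteinMLM name, and the example spike fragment) is an assumption for demonstration, not the implementation of PhyloTransformer or of the citing paper's model.

# A minimal sketch of BERT-style masked language modeling on protein
# sequences. All names and hyperparameters are illustrative assumptions,
# not the cited papers' actual implementation.
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD, MASK = 0, 1                      # special token ids (assumed)
VOCAB = {aa: i + 2 for i, aa in enumerate(AMINO_ACIDS)}
VOCAB_SIZE = len(AMINO_ACIDS) + 2

class ProteinMLM(nn.Module):
    """Transformer encoder that predicts masked amino-acid tokens."""
    def __init__(self, d_model=128, nhead=4, num_layers=2, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(VOCAB_SIZE, d_model, padding_idx=PAD)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, VOCAB_SIZE)   # per-position logits

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        h = self.encoder(self.tok(ids) + self.pos(pos))
        return self.head(h)

def mask_tokens(ids, mask_prob=0.15):
    """Randomly replace tokens with MASK; labels are -100 elsewhere."""
    labels = ids.clone()
    masked = (torch.rand_like(ids, dtype=torch.float) < mask_prob) & (ids != PAD)
    labels[~masked] = -100            # positions ignored by cross-entropy
    ids = ids.clone()
    ids[masked] = MASK
    return ids, labels

# One training step: predict the original residues at masked positions.
seq = "MFVFLVLLPLVSSQCVNLT"  # spike N-terminal fragment, for illustration only
ids = torch.tensor([[VOCAB[aa] for aa in seq]])
model = ProteinMLM()
inputs, labels = mask_tokens(ids)
logits = model(inputs)
loss = nn.functional.cross_entropy(
    logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=-100)
loss.backward()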
“…Language models have been used to decipher the genetic sequences of viruses. For example, a Transformer-based discriminative model was trained with SARS-CoV-2 genetic sequences to predict potential mutations that may lead to enhanced virus transmissibility (Wu et al. 2021). Language models have also been applied for protein prediction tasks, as common protein motifs and domains can be analogized to words, phrases, and sentences in human language (Ofer et al. 2021; Trifonov 2009; Strait and Dewey 1996; Yu et al. 2019).…”
Section: Current State of the Art (mentioning; confidence: 99%)
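
As a follow-on to the masked-LM sketch above, one generic way such a model can rank candidate point mutations is to mask a site and compare the model's log-probabilities for the mutant versus the wild-type residue. This is a common usage pattern, not the specific scoring procedure of PhyloTransformer or the works cited here; mutation_score is a hypothetical helper that reuses the model, ids, MASK, and VOCAB names from the earlier sketch.

# A hedged sketch of scoring a candidate point mutation with a masked
# protein language model: mask the site, then compare log-probabilities
# of the mutant and wild-type residues. Generic pattern, not the cited
# papers' procedure; builds on the ProteinMLM sketch above.
import torch

@torch.no_grad()
def mutation_score(model, ids, site, wt_id, mut_id):
    """log P(mutant) - log P(wild type) at a masked site."""
    masked = ids.clone()
    masked[0, site] = MASK
    log_probs = model(masked).log_softmax(dim=-1)[0, site]
    return (log_probs[mut_id] - log_probs[wt_id]).item()

# Example: score replacing the residue at position 5 with tryptophan (W).
score = mutation_score(model, ids, site=5,
                       wt_id=ids[0, 5].item(), mut_id=VOCAB["W"])
print(f"log-odds of mutation vs. wild type: {score:.3f}")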