2020
DOI: 10.1016/j.bbrc.2020.09.010

Machine learning methods accurately predict host specificity of coronaviruses based on spike sequences alone

Abstract: Coronaviruses infect many animals, including humans, due to interspecies transmission. Three of the known human coronaviruses: MERS, SARS-CoV-1, and SARS-CoV-2, the pathogen for the COVID-19 pandemic, cause severe disease. Improved methods to predict host specificity of coronaviruses will be valuable for identifying and controlling future outbreaks. The coronavirus S protein plays a key role in host specificity by attaching the virus to receptors on the cell membrane. We analyzed 1238 spike sequences for their…

Citations: cited by 79 publications (85 citation statements)
References: 30 publications
“…Performing different data analytics tasks on sequences has been done successfully by different researchers previously [20,22]. However, most studies require the sequences to be aligned [10,23,24]. The aligned sequences are used to generate fixed length numerical embeddings, which can then be used for tasks such as classification and clustering [20,25,26].…”
Section: Literature Review (mentioning)
confidence: 99%
“…Due to the availability of large-scale sequence data for the SARS-CoV-2 virus, an accurate and effective clustering method is needed to further analyze this disease, so as to better understand the dynamics and diversity of this virus. To classify different coronavirus hosts, authors in [10] suggest a one-hot encoding-based method that uses spike sequences alone. Their study reveals that they achieved excellent prediction accuracy considering just the spike portion of the genome sequence instead of using the entire sequence.…”
Section: Literature Review (mentioning)
confidence: 99%
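
The one-hot encoding idea in the statement above can be illustrated with a minimal sketch. It is not the cited authors' pipeline: the alphabet (20 standard amino acids plus a gap symbol), the all-zero treatment of unknown residues, the function names, and the toy fragments are all assumptions made for illustration. It only shows how aligned sequences of equal length become fixed-length 0/1 vectors.

```python
# Assumed alphabet: 20 standard amino acids plus '-' for alignment gaps.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(aligned_seq):
    """Flatten one aligned sequence into a 0/1 vector.

    The vector has len(aligned_seq) * len(ALPHABET) entries; each position
    contributes one block with a single 1 at the residue's index.
    """
    width = len(ALPHABET)
    vector = [0] * (len(aligned_seq) * width)
    for pos, ch in enumerate(aligned_seq.upper()):
        idx = INDEX.get(ch)
        if idx is not None:  # assumption: unknown symbols (e.g. 'X') stay all-zero
            vector[pos * width + idx] = 1
    return vector


def encode_alignment(aligned_seqs):
    """Encode a list of aligned sequences; all must share one length."""
    if len({len(s) for s in aligned_seqs}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    return [one_hot_encode(s) for s in aligned_seqs]


# Toy, made-up fragments (not real spike data).
toy = ["MFV-L", "MFVFL", "MKX-L"]
features = encode_alignment(toy)
```

Each row of `features` has the same length, so the rows could be passed, together with host labels, to any standard classifier.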