2022
DOI: 10.48550/arxiv.2201.02273
Preprint

PWM2Vec: An Efficient Embedding Approach for Viral Host Specification from Coronavirus Spike Sequences

Abstract: The origin of SARS-CoV-2 in humans, which led to the COVID-19 pandemic, is still unknown and is an important open question. There are speculations that bats are a possible origin. Likewise, there are many closely related (corona-) viruses, such as SARS, which was found to be transmitted through civets. The study of the different hosts which can be potential carriers and transmitters of deadly viruses to humans is crucial to understanding, mitigating and preventing current and future pandemics. In coronaviruses…

Cited by 1 publication (2 citation statements)
References 63 publications (77 reference statements)
“…Using the reference genome employed by GISAID (EPI_ISL_402124) 3 and the list of marker variants 4 for each GISAID clade with respect to this reference, we evaluated how many k-mers among the 50 chosen ones by Saliency Maps and Shap Values actually matched any of the reported marker variants. A summary is shown in Table 7.…”
Section: Matching Relevant K-mers To Mutations (mentioning)
confidence: 99%
“…Fast and efficient solutions to the clade assignment problem would help in tracking current and evolving strains and it is crucial for the surveillance of the pathogen. This classification problem has been attacked with machine learning approaches [3,4,5] using the Spike protein amino acid sequence to drive the classification step.…”
Section: Introduction (mentioning)
confidence: 99%