2020
DOI: 10.1101/2020.03.26.009001
Preprint

CHEER: hierarCHical taxonomic classification for viral mEtagEnomic data via deep leaRning

Abstract: The fast accumulation of viral metagenomic data has contributed significantly to new RNA virus discovery. However, the short read size, complex composition, and large data size can all make taxonomic analysis difficult. In particular, commonly used alignment-based methods are not ideal choices for detecting new viral species. In this work, we present a novel hierarchical classification model named CHEER, which can conduct read-level taxonomic classification from order to genus for new species. By comb…

Cited by 4 publications (3 citation statements)
References 28 publications

“…There are two major methods for the embedding layer: one-hot embedding and skip-gram embedding (Mikolov et al, 2013). As shown in our previous work of using CNN for classifying RNA viruses (Shang and Sun, 2020), the skip-gram-based embedding can improve CNN’s learning ability. Thus, in this work, we implemented a skip-gram embedding layer that can map proximate k-mers into highly similar vectors.…”
Section: Methods (mentioning)
confidence: 77%
“…There are a number of learning-based tools for microbe classification such as the Naïve Bayes classifier (Wang et al, 2007) and CNN (Shang and Sun, 2020). They use either manually derived or automatically learned features to predict taxonomic labels for bacteria or RNA viruses.…”
Section: Introduction (mentioning)
confidence: 99%
“…Homology based methods require higher computational resources and might produce unreliable alignment for novel viral species (Bazinet and Cummings, 2012). Supervised machine learning methods have been widely used to classify metagenomic reads against known viral and bacterial genomes (Ounit et al, 2015; Ounit and Lonardi, 2016; Shang and Sun, 2020). Machine learning techniques have also been used to assign taxonomic labels of viruses from genome sequence in CASTOR and ML-DSP (Remita et al, 2017; Randhawa et al, 2020, 2019).…”
Section: Introduction (mentioning)
confidence: 99%