2020
DOI: 10.1101/2020.01.29.925354
Preprint

Interpretable detection of novel human viruses from genome sequencing data

Abstract: Viruses evolve extremely quickly, so reliable methods for viral host prediction are necessary to safeguard biosecurity and biosafety alike. Novel human-infecting viruses are difficult to detect with standard bioinformatics workflows. Here, we predict whether a virus can infect humans directly from next-generation sequencing reads. We show that deep neural architectures significantly outperform both shallow machine learning and standard, homology-based algorithms, cuttin…

Cited by 27 publications (67 citation statements)
References 81 publications (116 reference statements)
“…While pathogenicity prediction methods using whole genomes or protein sets as input also exist, this work focuses on read-based classification to offer real-time predictions and avoid delays necessitated by assembly pipelines. However, read-based methods have been shown to perform well and also for full genomes and assembled contigs, achieving similar or better performance than alignment-based approaches (Deneke et al, 2017; Bartoszewicz et al, 2020, 2021).…”
Section: Read-based Detection Of Novel Pathogens
Citation type: mentioning
confidence: 99%
“…For the development of effective drugs, the systems demonstrated in [39] trained GAs and GANs, [41] used reinforcement learning techniques, and [43] applied LSTM networks. The human-infecting virus can be identified using deep learning-based architectures utilizing its next-generation sequence shown in [45].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…It is difficult to detect human-infecting viruses using bioinformatics systems. Bartoszewicz et al [45] proposed an approach to predict whether a virus can infect the human-body directly utilizing next-generation sequence. The system showed that CNN and LSTM-based architecture outperformed the other machine learning algorithms and generalized to taxonomic units with a half error rate from those that are presented in the training phase.…”
Section: Deep Learning Applications For Covid-19
Citation type: mentioning
confidence: 99%
“…Shrikumar et al proposed an idea of reverse-complement network for genomic data, which can simultaneously read double-stranded DNA strands [10]. Based on the network structure, Jakub et al studied the detection method of the novel pathogenic DNA viruses [11], and proposed an interpretable learning method for detecting novel human viruses from genome sequencing data [12].…”
Section: B Feature Extraction
Citation type: mentioning
confidence: 99%