2020
DOI: 10.1101/2020.12.06.413476
Preprint

vHULK, a new tool for bacteriophage host prediction based on annotated genomic features and deep neural networks

Abstract: The experimental determination of a bacteriophage host is a laborious procedure. For this reason, there is a pressing need for reliable computational predictions of bacteriophage hosts in phage research in general and in phage therapy in particular. Here, we present a new program called vHULK for phage host prediction based on 9,504 phage genome features. These features take into account alignment significance scores between predicted-protein sequences in the phage genomes and a curated database of viral prote…

Cited by 16 publications (19 citation statements)
References 42 publications
“…In this section, we will show our experimental results on different datasets and compare CHERRY against the state-of-the-art tools: WIsH [10], PHP [24], VHM-Net [12], VPF-Class [13], vHULK [19], RaFAH [21], and HostG [27].…”
Section: Results (mentioning)
confidence: 99%
“…The model will then calculate the likelihood of a prokaryote genome as the host for a query virus and assigns the host with the highest likelihood. vHULK [19] formulates host prediction as a multi-class classification problem where the inputs are viruses and the labels are the prokaryotes. The features used in their deep learning model is the protein profile alignment results against pVOGs database of phage protein families [20].…”
Section: Related Work (mentioning)
confidence: 99%
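The multi-class formulation described in the statement above can be illustrated with a minimal sketch. It assumes that each phage is reduced to a fixed-length vector of best alignment scores per protein family and that a single linear layer with softmax stands in for the deep network. The family identifiers, weights, and function names are hypothetical and are not taken from vHULK.

```python
import math

def build_feature_vector(hits, family_index):
    """Turn (family_id, score) alignment hits into a fixed-length vector.

    hits: list of (family_id, score) tuples for one phage genome.
    family_index: dict mapping family_id to a column position.
    Families missing from the index are ignored; the best score per family is kept.
    """
    vec = [0.0] * len(family_index)
    for family_id, score in hits:
        col = family_index.get(family_id)
        if col is not None:
            vec[col] = max(vec[col], score)
    return vec

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_host(vec, weights, labels):
    """Score every host label and return (best_label, class_probabilities).

    weights: one row of per-feature weights for each label (toy linear model,
    an assumption standing in for the trained neural network).
    """
    logits = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs

# Hypothetical toy data: three protein families, two candidate host genera.
family_index = {"FAM_A": 0, "FAM_B": 1, "FAM_C": 2}
hits = [("FAM_A", 20.0), ("FAM_A", 50.0), ("FAM_C", 10.0), ("FAM_UNKNOWN", 99.0)]
weights = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.1]]
labels = ["Escherichia", "Bacillus"]

vec = build_feature_vector(hits, family_index)
label, probs = predict_host(vec, weights, labels)
```

In this toy setup the phage is assigned to whichever label receives the highest softmax probability; a real model would learn the weights from phages with known hosts.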
“…In both CRISPR searches, we required 100% matches between our contigs and spacer sequences. Lastly, we ran our contigs through the vHULK prediction tool (42). In only a few cases did these methods disagree, but when they did, we chose the host according to the order outlined above.…”
Section: Results (mentioning)
confidence: 99%
“…Only applicable to phages sharing at least 1 marker gene with a known phage reference. vHULK [34] Phage marker genes / HMM profiles Protein families (pVOGs) [33] VPF-Class [36] Phage marker genes / HMM profiles Protein families (VPF) [36] Alignment-free HostPhinder [39] Phage genomes Nucleotide 16-mer frequencies Independent of gene prediction and host reference database.…”
Section: Discussion (mentioning)
confidence: 99%
“…In contrast to methods leveraging sequence similarity between virus and candidate host genomes, another group of tools relies instead on a comparison between a query phage and a set of pre-defined phage marker genes (Table 1). In Viral Host UnveiLing Kit (vHULK), phage predicted protein sequences are affiliated to the Prokaryotic Virus Orthologous Group (pVOGs) database [33], and the pVOG list of each query genome is used as input for two deep neural networks which provide a prediction of host species and genus, along with a measure for prediction confidence (i.e., entropy value) [34]. VPF-Class compared phage predicted proteins against a subset of Viral Protein Families (VPFs, [35]) and derive host prediction and confidence scores from the list of VPFs detected on each query genome, first at the host domain level and then to the family and genus levels, based on the distribution of these VPFs in reference phage genomes [36].…”
Section: I2 Approaches Based On Viral Marker Genes (mentioning)
confidence: 99%
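The statement above mentions an entropy value as the measure of prediction confidence. As a hedged illustration only, the sketch below assumes this means the Shannon entropy of the network's class-probability output, where lower entropy indicates a more concentrated, and so more confident, prediction. The exact definition used by vHULK is not given in the text, and the function name is hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (natural log) of a probability distribution.

    Zero-probability classes contribute nothing, by the usual 0 * log(0) = 0 convention.
    """
    return sum(-p * math.log(p) for p in probs if p > 0.0)

# Hypothetical class probabilities over four candidate hosts.
confident = [0.97, 0.01, 0.01, 0.01]   # mass concentrated on one host
uncertain = [0.25, 0.25, 0.25, 0.25]   # mass spread evenly

h_confident = shannon_entropy(confident)
h_uncertain = shannon_entropy(uncertain)
```

Under this assumption, a uniform distribution over k classes gives the maximum entropy log(k), so a threshold on the entropy value could be used to flag predictions that should not be trusted.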