2022
DOI: 10.3389/fmicb.2022.1061122

iProm-phage: A two-layer model to identify phage promoters and their types using a convolutional neural network

Abstract: The increased interest in phages as antibacterial agents has resulted in a rise in the number of sequenced phage genomes, necessitating the development of user-friendly bioinformatics tools for genome annotation. A promoter is a DNA sequence that is used in the annotation of phage genomes. In this study, we proposed a two-layer model called “iProm-phage” for the prediction and classification of phage promoters. The model’s first layer identifies a query sequence as promoter or non-promoter, and if the query sequence is pr…
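The abstract describes a cascaded design: a first CNN decides promoter vs. non-promoter, and only sequences called as promoters are passed to a second CNN that assigns a promoter type. The following is a minimal sketch of that two-stage flow, assuming a Keras implementation; the window length, layer sizes, and number of promoter classes are placeholders, not details taken from the paper.

    # Hypothetical two-stage (two-layer) promoter prediction pipeline.
    # Architecture and hyperparameters are illustrative assumptions only.
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, models

    SEQ_LEN = 81            # assumed promoter window length (hypothetical)
    NUM_PROMOTER_TYPES = 3  # assumed number of promoter classes (hypothetical)

    def one_hot(seq):
        """Encode a DNA string as a (SEQ_LEN, 4) one-hot matrix."""
        mapping = {"A": 0, "C": 1, "G": 2, "T": 3}
        arr = np.zeros((SEQ_LEN, 4), dtype=np.float32)
        for i, base in enumerate(seq[:SEQ_LEN]):
            if base in mapping:
                arr[i, mapping[base]] = 1.0
        return arr

    def build_cnn(num_outputs, final_activation):
        """Small 1D CNN; layer sizes are illustrative only."""
        return models.Sequential([
            layers.Input(shape=(SEQ_LEN, 4)),
            layers.Conv1D(32, 7, activation="relu"),
            layers.MaxPooling1D(2),
            layers.Flatten(),
            layers.Dense(64, activation="relu"),
            layers.Dense(num_outputs, activation=final_activation),
        ])

    # Layer 1: promoter vs. non-promoter; Layer 2: promoter type.
    layer1 = build_cnn(1, "sigmoid")
    layer2 = build_cnn(NUM_PROMOTER_TYPES, "softmax")

    def predict(seq):
        x = one_hot(seq)[None, ...]
        if layer1.predict(x, verbose=0)[0, 0] < 0.5:
            return "non-promoter"
        type_idx = int(np.argmax(layer2.predict(x, verbose=0)[0]))
        return f"promoter (type {type_idx})"

    # Untrained models, so the output here is arbitrary; shown only to
    # illustrate the decision flow.
    print(predict("ATGC" * 21))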

Cited by 6 publications (5 citation statements)
References 24 publications

Citation statements (ordered by relevance):
“…Existing promoter prediction algorithms are primarily trained based on known promoter sequences, and their predictive ability is accordingly limited for novel test enzymes with significantly different sequences and structures. [24, 35] Indeed, given the paucity of characterized RNAP promoters, novel “test” enzymes may recognize sequence motifs and architectures that are (i) substantially different to those used to train/design current algorithms, and accordingly (ii) beyond the predictive capabilities of available tools. Rationalizing that divergence in promoter structure/sequence would be underpinned by differences at the amino acid level, we investigated whether enzyme inactivity was associated with DNA binding domain sequences that varied significantly to those of well‐studied biocatalysts.…”
Section: Results (mentioning)
confidence: 99%
“…Existing promoter prediction algorithms are primarily trained based on known promoter sequences, and their predictive ability is accordingly limited for novel test enzymes with significantly different sequences and structures. [24,35] Indeed, given the paucity of characterized RNAP promoters, novel "test"…”
Section: Cognate Promoter Prediction Is the Critical Limiting Factor ... (mentioning)
confidence: 99%
“…AMG flags are then assigned, as described by Shaffer et al. (2020), to highlight the metabolic potential of the genes and qualify confidence in the genes being viral. In addition to probability scoring cut-offs, promoter and terminator recognition are also helpful for determining the viral origin of potential AMGs (Shujaat et al, 2022). After identifying the AMGs using the methods described above, there is still the possibility of erroneous functional annotation based solely on sequence similarity searches.…”
Section: Heavy Metal/Metalloid Resistance and Detoxification (mentioning)
confidence: 99%
“…Thus, this study utilized a one-hot feature encoding scheme. Several recent cutting-edge bioinformatics tools have used this technique [40, 41, 42, 43]. Each nucleotide A, C, G, and T is represented as follows: …”
Section: Feature Encoding Scheme (mentioning)
confidence: 99%
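For context, the one-hot scheme referenced in the statement above maps each nucleotide to a four-dimensional indicator vector. Below is a minimal sketch of such an encoder; the A/C/G/T column order and the handling of ambiguous bases are assumptions for illustration, not details taken from the cited work.

    # Minimal sketch of one-hot encoding for DNA sequences.
    import numpy as np

    # Assumed column order A, C, G, T (hypothetical; the cited work may differ).
    ONE_HOT = {
        "A": [1, 0, 0, 0],
        "C": [0, 1, 0, 0],
        "G": [0, 0, 1, 0],
        "T": [0, 0, 0, 1],
    }

    def encode(seq: str) -> np.ndarray:
        """Return a (len(seq), 4) one-hot matrix; ambiguous bases (e.g. N) map to all zeros."""
        return np.array(
            [ONE_HOT.get(base.upper(), [0, 0, 0, 0]) for base in seq],
            dtype=np.float32,
        )

    print(encode("ACGTN"))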