2016
DOI: 10.1186/s13062-016-0163-0

xHMMER3x2: Utilizing HMMER3’s speed and HMMER2’s sensitivity and specificity in the glocal alignment mode for improved large-scale protein domain annotation

Abstract: Background: While the local-mode HMMER3 is notable for its massive speed improvement, the slower glocal-mode HMMER2 is more exact for domain annotation by enforcing full domain-to-sequence alignments. Since a unit of domain necessarily implies a unit of function, local-mode HMMER3 alone remains insufficient for precise function annotation tasks. In addition, the incomparable E-values for the same domain model by different HMMER builds create difficulty when checking for domain annotation consistency on a large-s…

Cited by 5 publications (4 citation statements)
References 25 publications
“…org) and the Phytozome Genome Data Resource Library (https://genome.jgi.doe.gov/portal). The first batch of candidate genes containing PF00170 and PF07716 domains was identified (E-value = e^-10) using HMMER v.3.2 (Yap et al, 2016).…”
Section: Identification Of Pbbzip Family Members (mentioning)
confidence: 99%
“…The PF00487 domain model files of PbFAD family members were downloaded from the PFAM website (https://www.pfam.org). The candidate genes containing PF00487 domains were identified (E-value = e^-10) using HMMER v.3.2 (Yap et al, 2016). The FAD protein sequences of white pear and Arabidopsis were extracted and aligned.…”
Section: Methods (mentioning)
confidence: 99%
“…MAFFT used default parameters to align the multiple homologous FAD genes. A phylogenetic tree was constructed using the maximum likelihood method [(bootstraps = 1,000) and IQ-TREE 1.6.9 software (Yap et al, 2016)].…”
Section: Methods (mentioning)
confidence: 99%
“…To finally determine if a transcript is a lncRNA, four popular methods for coding potential analysis were applied: (1) CPC (Coding-Potential Calculator) [64] computes the coding potential of a transcript by matching it to the NCBI nr database using BLASTX and scoring it using a support vector machine, (2) CNCI (Coding-Non-Coding Index) distinguishes protein-coding and noncoding transcripts independent of known annotations and predicts the coding or noncoding potential based solely on the features of nucleotide triplets, (3) transcripts were translated into proteins and matched to known protein domains in Pfam [65] using HMMER3 [66] where a matched sequence is considered as having coding potential, whereas others are considered as noncoding, and (4) PhyloCSF (Phylogenetic Codon Substitution Frequency) uses genome-wide mammalian sequence alignments to calculate the coding potential of transcripts.…”
Section: Methods (mentioning)
confidence: 99%