2009
DOI: 10.1093/bioinformatics/btp593

Significant speedup of database searches with HMMs by search space reduction with PSSM family models

Abstract: Motivation: Profile hidden Markov models (pHMMs) are currently the most popular modeling concept for protein families. They provide sensitive family descriptors, and sequence database searching with pHMMs has become a standard task in today's genome annotation pipelines. On the downside, searching with pHMMs is computationally expensive. Results: We propose a new method for efficient protein family classification and for speeding up database searches with pHMMs as is necessary for large-scale analysis scenarios…

citations
Cited by 10 publications
(8 citation statements)
references
References 42 publications
“…Conserved sequence motifs were predicted by MEME (Bailey and Elkan, ) with a minimum size of 5 nucleotides, maximum size of 20 nucleotides, zero or one occurrence per sequence, using the promoter regions of eight GalA‐regulated genes as input. The occurrence of conserved motifs in the promoter regions of other genes was analysed with PoSSuM search (Beckstette et al., ) by scanning for matches to the PSSMs describing the motif.…”
Section: Methods (mentioning)
confidence: 99%
“…We retrieved the gl_SOLEXA_5_FBgn0004618 binding site motif from Fly Factor Survey (Beckstette et al, 2009; Zhu et al, 2011). We used the motif to search the FlyBase r6.03 assembly for binding sites using PoSSumSearch2 with a p-value cutoff of 1E-04.…”
Section: Methods (mentioning)
confidence: 99%
“…Locating reverse lcp-intervals can be accelerated by skp-tables. These tables, introduced in Beckstette et al. [37] and hereinafter referred to as skp_F and skp_R, can be constructed in linear time [38] and allow one to quickly skip intervals in suf_X (for details, see [37]).…”
Section: Methods (mentioning)
confidence: 99%