2005
DOI: 10.1007/11496656_13

Hardness of Optimal Spaced Seed Design

Abstract: Speeding up approximate pattern matching is a line of research in stringology since the 80's. Practically fast approaches belong to the class of filtration algorithms, in which text regions dissimilar to the pattern are first excluded, and the remaining regions are then compared to the pattern by dynamic programming. Among the conditions used to test similarity between the regions and the pattern, many require a minimum number of common substrings between them. When only substitutions are taken into account fo…

Citations: cited by 13 publications (8 citation statements)
References: 23 publications
“…For instance, a set of 4 spaced seeds of weight 13 was manually designed to search for 33 bp tags [20]. Hence, adaptation of ZOOM to a new setup requires the design of specific seeds, which is a theoretically hard and practically difficult problem [25,26,27]. The present limitation of ZOOM to patterns up to 64 bp is certainly due to this bottleneck.…”
Section: Discussion (mentioning)
confidence: 99%
“…Hence, sets of spaced seeds are specifically designed for a certain read/match length and a maximum number of allowed differences, and different sets corresponding to different parameter combinations are hard coded in ZOOM. All known formulations of the seed design problem are at least NP-hard, even for a single seed [35,42,47].…”
Section: Other Solutions For Mapping Reads (mentioning)
confidence: 99%
“…For a given spaced seed, computing the probability that it generates a hit in a homology region is NP-hard in the Bernoulli model [28], as well as in the uniform model [29].…”
Section: Terminologies (mentioning)
confidence: 99%
“…Then the relaxed seed is equivalent to the spaced seed S ∪ S . Computing the sensitivity is NP-hard in both models [28,29].…”
Section: Theorem (mentioning)
confidence: 99%
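
The "probability that a seed generates a hit in a homology region" mentioned in the statements above can be made concrete with a small brute-force sketch. It is not the method of any cited paper. It assumes a seed written as a string over `1` (position must match) and `*` (don't care), a region of length `n` in which each position matches independently with probability `p` (the Bernoulli model), and hypothetical function names `seed_hits` and `hit_probability`. It enumerates all 2^n match/mismatch patterns, so it is only usable for very short regions.

```python
from itertools import product


def seed_hits(seed: str, region: tuple) -> bool:
    """Return True if the seed hits the region at some offset.

    `region` is a tuple of 0/1 flags (1 = the aligned position is a match).
    A hit at an offset means every '1' position of the seed falls on a match.
    """
    required = [i for i, c in enumerate(seed) if c == "1"]
    for start in range(len(region) - len(seed) + 1):
        if all(region[start + i] for i in required):
            return True
    return False


def hit_probability(seed: str, n: int, p: float) -> float:
    """Exhaustively sum the probability of all regions of length n that are hit.

    Illustrative only: the loop visits 2**n regions.
    """
    total = 0.0
    for region in product((0, 1), repeat=n):
        if seed_hits(seed, region):
            matches = sum(region)
            total += p ** matches * (1 - p) ** (n - matches)
    return total


if __name__ == "__main__":
    # Compare a contiguous seed and a spaced seed of the same weight (3).
    for seed in ("111", "11*1"):
        print(seed, hit_probability(seed, n=12, p=0.7))
```

The exponential enumeration is a deliberate simplification: it shows what quantity is being computed, not how to compute it efficiently.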