“…Barbay et al.'s result holds even when we are given a partition of a permutation into ρ increasing or decreasing subsequences, and some authors [1,10] have found that using both increasing and decreasing subsequences often improves compression in practice. Computing a partition into the minimum number of such subsequences is NP-hard, however, and we see no reason why SA⁻¹ ∘ SSA should contain long decreasing subsequences.…”
Section: New Directions (mentioning)
confidence: 98%
“…For our example, we can partition both SSA and SA into [4,6,9,2], for T′_i = a; [7,0], for [8,1], for T′_i = r; and [10], for T′_i = ε. In this particular case, however, we could just as well partition both SSA and SA into only two common subsequences: e.g., [10,7,0] and [3,5,8,1,4,6,9,2].…”
Section: Theory (mentioning)
confidence: 99%
“…For example, if π[0..12] = [12,11,6,3,8,0,5,7,4,9,1,10,2] and π[0..11] = [11,10,7,0,3,5,8,1,4,6,9,2], then the elements in their subsequences [12,11,8,0,9,1,10,2] and [11,10,7,0,8,1,9,2]…”
Section: New Directions (mentioning)
confidence: 99%
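The subsequence claim in the excerpt above can be checked mechanically. The following is a minimal sketch, not code from the cited paper; the names `pi_a`, `pi_b`, and `is_subsequence` are hypothetical labels introduced here for the two quoted arrays and the check.

```python
def is_subsequence(sub, seq):
    """Return True if `sub` appears in `seq` in order (not necessarily contiguously)."""
    it = iter(seq)
    # `x in it` advances the iterator until it finds x, so order is enforced.
    return all(x in it for x in sub)


# The two arrays quoted in the excerpt (variable names are assumptions).
pi_a = [12, 11, 6, 3, 8, 0, 5, 7, 4, 9, 1, 10, 2]
pi_b = [11, 10, 7, 0, 3, 5, 8, 1, 4, 6, 9, 2]

# The two subsequences quoted in the excerpt.
sub_a = [12, 11, 8, 0, 9, 1, 10, 2]
sub_b = [11, 10, 7, 0, 8, 1, 9, 2]

print(is_subsequence(sub_a, pi_a))  # True
print(is_subsequence(sub_b, pi_b))  # True
```

The check only confirms that each quoted list is an in-order subsequence of its array; it says nothing about the rest of the truncated sentence.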
“…Supowit [28] gave a simple algorithm that partitions SA⁻¹ ∘ SSA into ρ increasing subsequences in O(n lg ρ) ⊆ O(n min(w, ℓ − w) lg σ) time. When applied to SA⁻¹ ∘ SSA in our example, Supowit's algorithm partitions it into [0,3,4] and [1,2,5,6,7,8,9,10].…”
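To illustrate how a permutation can be split into the minimum number of increasing subsequences in O(n lg ρ) time, here is a minimal sketch of the standard greedy approach with binary search. It is written in the spirit of the algorithm the excerpt attributes to Supowit, but it is not taken from that source and may differ in detail. The function name `partition_increasing` and the input permutation are assumptions: the excerpt does not give SA⁻¹ ∘ SSA itself, so the input below is a hypothetical permutation chosen to yield the same two subsequences as quoted.

```python
from bisect import bisect_right


def partition_increasing(perm):
    """Greedily partition a sequence of distinct values into increasing subsequences.

    Each element is appended to the subsequence whose last element is the
    largest one smaller than it; if no such subsequence exists, a new one
    is started. The tails stay in decreasing order, so a binary search
    finds the right subsequence in O(lg rho) time per element.
    """
    subseqs = []    # the increasing subsequences, in order of creation
    neg_tails = []  # negated last elements; kept sorted in increasing order
    for x in perm:
        # Leftmost subsequence whose tail is smaller than x.
        i = bisect_right(neg_tails, -x)
        if i == len(neg_tails):
            subseqs.append([x])
            neg_tails.append(-x)
        else:
            subseqs[i].append(x)
            neg_tails[i] = -x
    return subseqs


# Hypothetical input (an assumption, not the paper's actual permutation).
example = [1, 2, 5, 0, 3, 4, 6, 7, 8, 9, 10]
print(partition_increasing(example))
# [[1, 2, 5, 6, 7, 8, 9, 10], [0, 3, 4]]
```

The number of subsequences produced equals the length of the longest decreasing subsequence of the input, which is the minimum possible for a partition into increasing subsequences.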
Abstract. Spaced seeds are important tools for similarity search in bioinformatics, and using several seeds together often significantly improves their performance. With existing approaches, however, for each seed we keep a separate linear-size data structure, either a hash table or a spaced suffix array (SSA). In this paper we show how to compress SSAs relative to normal suffix arrays (SAs) and still support fast random access to them. We first prove a theoretical upper bound on the space needed to store an SSA when we already have the SA. We then present experiments indicating that our approach works even better in practice.