2009
DOI: 10.1145/1498698.1594228

Engineering a compressed suffix tree implementation

Abstract: Suffix tree is one of the most important data structures in string algorithms and biological sequence analysis. Unfortunately, when it comes to implementing those algorithms and applying them to real genomic sequences, often the main memory size becomes the bottleneck. This is easily explained by the fact that while a DNA sequence of length n from alphabet Σ = {A, C, G, T} can be stored …

Citations: cited by 21 publications (17 citation statements)
References: 36 publications (39 reference statements)
“…A natural alternative to this approach is to use specialized compressed data structures which allow specific operations to be performed on compressed data. This study of compressed or succinct data structures is advancing rapidly; for example, both Grossi et al and Välimäki et al show how compressed suffix trees help in practice for real-world problems by reducing memory use [26,49]. While we believe this is a promising approach, to our knowledge, no compressed data structure provides a suitable interface to implement a generic GroupBy-Aggregate operation.…”
Section: Compression For Memory Efficiency (mentioning)
confidence: 92%
“…DRAM is expensive and an expensive consumer of power [29]. Memory accesses are a common bottleneck for high-performance applications [49]. With the number of cores per socket growing faster than the memory … Table 1: Amazon EC2 proportional resource costs (# resource units × per-hour unit resource cost); the per-hour unit resource costs are 1.51¢ (1 Elastic Compute Unit), 1.93¢ (1 GB RAM) and 0.018¢ (1 GB storage); analysis detailed in §5.3.…”
Section: Introduction (mentioning)
confidence: 99%
“…We compare the following CST implementations: Välimäki et al's [20] implementation of Sadakane's compressed suffix tree [11] (CST-Sadakane); Russo's implementation of Russo et al's "fully-compressed" suffix tree [14] (FCST); and our best variants. These are called Our CST in the plots.…”
Section: Comparing the CST Implementations (mentioning)
confidence: 99%
“…The solution based on explicit topology was implemented by Välimäki et al [20]. As expected from theory, the structure is very fast, achieving a few tens of microseconds per operation, but uses significant space (about 25-35 bpc, close to a suffix array).…”
Section: Introduction (mentioning)
confidence: 99%