2019 Data Compression Conference (DCC), 2019
DOI: 10.1109/dcc.2019.00020

Tunneling on Wheeler Graphs

Abstract: The Burrows-Wheeler Transform (BWT) is an important technique both in data compression and in the design of compact indexing data structures. It has been generalized from single strings to collections of strings and some classes of labeled directed graphs, such as tries and de Bruijn graphs. The BWTs of repetitive datasets are often compressible using run-length compression, but recently Baier (CPM 2018) described how they could be even further compressed using an idea he called tunneling. In this paper we sho…
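
To make the abstract's starting point concrete, the following is a minimal sketch of a BWT followed by run-length encoding. It is not the paper's method and does not implement tunneling; the function names `bwt` and `run_length_encode`, the `$` sentinel, and the naive sorted-rotations construction are all assumptions chosen for illustration.

```python
from itertools import groupby


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text + sentinel, take the last column.

    Illustrative only: this takes O(n^2 log n) time, whereas practical
    tools build the transform from a suffix array.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse each maximal run of equal characters into (char, length)."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


transformed = bwt("banana")
runs = run_length_encode(transformed)
print(transformed)  # annb$aa
print(runs)         # [('a', 1), ('n', 2), ('b', 1), ('$', 1), ('a', 2)]
```

On repetitive inputs the transform tends to group equal characters into long runs, which is why the run-length step is the usual baseline that tunneling aims to improve on.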

Cited by 15 publications (14 citation statements)
References 12 publications

“…Despite being well-suited for compressing repetitive tree topologies, these representations also are not (yet) able to support efficient indexing queries on topologies more complex than paths so we do not discuss them further. The recent tunneling [48] and run-length XBW Transform [49] compression schemes are the first compressed representations for repetitive topologies supporting count queries (the latter technique supports also locate queries, but it requires additional linear space). These techniques can be applied to Wheeler graphs [50], as well, and are covered more in detail in Sections 3.5 and 3.7.…”
Section: Graph Compression (mentioning)
confidence: 99%
“…Another method to compress the XBWT is to exploit repetitions of isomorphic subtrees. Alanko et al [48] show how this can be achieved by a technique they call tunneling and that consists in collapsing isomorphic subtrees that are adjacent in co-lexicographic order. Tunneling works for a more general class of graphs (Wheeler graphs), so we discuss it more in detail in Section 3.7.…”
Section: Compression (mentioning)
confidence: 99%
“…It will be shown that each edge-reduced de Bruijn graph corresponds to a tunneled BWT of the underlying string, where the number of edges in the graph is identical to the length of the tunneled BWT. Therefore, we show that solving the edge minimization problem provides significant progress towards a solution to the open problem of finding the optimal disjoint blocks that minimize space, as stated in [8].…”
Section: Introduction (mentioning)
confidence: 96%
“…Starting from an underlying order of the alphabet, the order on the states (formally given in Definition 1) must: i) agree with the order of the labels of their incoming edges, and ii) be coherent on target/source nodes, for pairs of arcs with equal labels. It turns out that this kind of automata, called Wheeler automata, (a) admit an efficient index data structure for searching subpaths labeled with a given query pattern, and (b) enable a representation of the graph in a space proportional to that of the edges' labels since the topology can be encoded with just O(1) bits per node [13] (as well as enabling more advanced compression mechanisms, see [3,11]). This is in contrast with the fact that general graphs require a logarithmic (in the graph's size) number of bits per edge to be represented, as well as with recent results showing that in general, the subpath search problem can not be solved in subquadratic time, unless the strong exponential time hypothesis is false [4,5,6,7,10].…”
Section: Introduction (mentioning)
confidence: 99%
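
The two ordering conditions quoted in the last statement can be checked mechanically for a candidate node order. The sketch below assumes the commonly stated Wheeler graph definition, which may differ in detail from Definition 1 of the citing paper: nodes without incoming edges come first; for edges (u, v, a) and (u', v', a'), a < a' implies v < v'; and a = a' with u < u' implies v <= v'. The function name `is_wheeler_order` and the edge-triple representation are hypothetical.

```python
def is_wheeler_order(order, edges):
    """Check whether `order` satisfies the (assumed) Wheeler conditions.

    order: list of nodes, smallest first.
    edges: list of (source, target, label) triples with comparable labels.
    Uses a quadratic pairwise check, which is fine for small examples.
    """
    pos = {node: i for i, node in enumerate(order)}
    has_incoming = {target for _, target, _ in edges}

    # Nodes with in-degree 0 must precede all nodes with incoming edges.
    seen_incoming = False
    for node in order:
        if node in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    for u, v, a in edges:
        for u2, v2, a2 in edges:
            # (i) Smaller label implies strictly smaller target.
            if a < a2 and not pos[v] < pos[v2]:
                return False
            # (ii) Equal labels: the order of sources carries over to targets.
            if a == a2 and pos[u] < pos[u2] and not pos[v] <= pos[v2]:
                return False
    return True


edges = [(0, 1, "a"), (0, 2, "b"), (1, 3, "b")]
print(is_wheeler_order([0, 1, 2, 3], edges))  # True
print(is_wheeler_order([0, 2, 1, 3], edges))  # False
```

This only verifies a given order. Deciding whether any valid order exists for an arbitrary graph is a separate and harder problem that the sketch does not attempt.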