2020
DOI: 10.1016/j.tcs.2019.11.001

Lightweight merging of compressed indices based on BWT variants

Abstract: In this paper we propose a flexible and lightweight technique for merging compressed indices based on variants of the Burrows-Wheeler transform (BWT), thus addressing the need for algorithms that compute compressed indices over large collections using a limited amount of working memory. Merge procedures make it possible to use an incremental strategy for building large indices based on merging indices for progressively larger subcollections. Starting with a known lightweight algorithm for merging BWTs [Holt and McM…

Cited by 4 publications (4 citation statements)
References 39 publications (74 reference statements)
“…In this variant, the output consists of the last characters of the lexicographically sorted cyclic rotations of all factors of the Lyndon Factorization [16] of s. This variant has been recently shown in [3] to satisfy 1, but being based on the cyclic rotations of the Lyndon factors properties 2 and 3 have not been studied. Note that the related variant Extended BWT [46], which takes as input a collection of strings, does satisfy property 3 for the problem of circular pattern search [24,29,40,39,10,11].…”
Section: Known BWT Variants (mentioning)
confidence: 99%
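
The construction described in the statement above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the cited papers: the function names `lyndon_factors`, `omega_cmp`, and `bijective_bwt` are hypothetical, the factorization uses the standard Duval algorithm, and rotations of factors with different lengths are assumed to be compared by their infinite repetitions, which is the ordering commonly used for this variant (the statement itself only says "lexicographically sorted").

```python
from functools import cmp_to_key


def lyndon_factors(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it is still a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # The run consists of repetitions of one Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def omega_cmp(u, v):
    """Compare u and v by their infinite repetitions (assumed ordering).

    Equal-length prefixes of length len(u) * len(v) are enough to decide the order.
    """
    a, b = u * len(v), v * len(u)
    return (a > b) - (a < b)


def bijective_bwt(s):
    """Last characters of the sorted cyclic rotations of all Lyndon factors of s."""
    rotations = [f[i:] + f[:i] for f in lyndon_factors(s) for i in range(len(f))]
    rotations.sort(key=cmp_to_key(omega_cmp))
    return "".join(r[-1] for r in rotations)


print(lyndon_factors("banana"))  # ['b', 'an', 'an', 'a']
print(bijective_bwt("banana"))   # annbaa
```

The sketch materializes and sorts every rotation, so it is only suitable for short strings; it is meant to make the definition concrete, not to reflect how practical constructions work.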
“…Merging two de Bruijn graphs $G_0$ and $G_1$, or other succinct indices [19], amounts to building a new succinct data structure that supports the retrieval of the elements which are in $G_0$ or in $G_1$. Because of the correspondence between succinct data structures and Wheeler automata, the natural generalization of the problem of merging succinct indices is the problem of computing a Wheeler automaton recognizing the union language $L = L_0 \cup L_1$, given the two Wheeler automata $A_0 = (V_0, E_0, F_0, s_0, \Sigma)$ and…”
Section: Merging Wheeler Graphs via 2-SAT (mentioning)
confidence: 99%
“…Notice that the working space of our algorithm is always less than the space of the resulting succinct de Bruijn graph. Our new merging algorithm is based on a mixed LSD/MSD Radix Sort algorithm which is inspired by the lightweight BWT merging introduced by Holt and McMillan [28,29] and later improved in [18,19]. In addition to its small working space, our algorithm has the remarkable feature that it can compute as a by-product, with no additional cost, the LCS (Longest Common Suffix) between the node labels in Bowe et al.'s representation, thus making it possible to construct succinct Variable Order de Bruijn graphs [10], a feature not shared by any other merging algorithm.…”
Section: Introduction (mentioning)
confidence: 99%