2020
DOI: 10.3390/a14010005

Re-Pair in Small Space

Abstract: Re-Pair is a grammar compression scheme with favorably good compression rates. The computation of Re-Pair comes with the cost of maintaining large frequency tables, which makes it hard to compute Re-Pair on large-scale data sets. As a solution for this problem, we present, given a text of length $n$ whose characters are drawn from an integer alphabet with size $\sigma = n^{O(1)}$, an $O(\min(n^2, n^2 \lg \log_\tau n \, \lg \lg \lg n / \log_\tau n))$ time algorithm computing Re-Pair with $\max((n/c) \lg n, n \lg \tau) + O(\lg n)$ bits of working space including the text spac…
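
For readers unfamiliar with the scheme, the following is a minimal sketch of the textbook, naive Re-Pair procedure: repeatedly replace the most frequent pair of adjacent symbols with a fresh non-terminal until no pair occurs at least twice. It is not the small-space algorithm of the paper; it rebuilds a full frequency table in every round, which is exactly the memory cost the abstract refers to. The function names (`count_bigrams`, `replace_pair`, `repair`, `expand`) and the convention of representing non-terminals as integers are assumptions made for this illustration only.

```python
from collections import Counter


def count_bigrams(seq):
    """Count non-overlapping occurrences of each adjacent pair, left to right."""
    counts = Counter()
    last_start = {}  # start index of the last counted occurrence per pair
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        # In a run such as "aaa", the occurrences at i-1 and i overlap.
        if pair[0] == pair[1] and last_start.get(pair) == i - 1:
            continue
        counts[pair] += 1
        last_start[pair] = i
    return counts


def replace_pair(seq, pair, new_symbol):
    """Replace every non-overlapping occurrence of `pair` with `new_symbol`."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def repair(text):
    """Naive Re-Pair: returns (compressed sequence, rules).

    Terminals are one-character strings; non-terminals are integers
    (an illustrative convention, so the two can never collide).
    """
    seq = list(text)
    rules = {}
    while True:
        counts = count_bigrams(seq)  # full frequency table, rebuilt each round
        if not counts:
            break
        pair, freq = max(counts.items(), key=lambda kv: kv[1])
        if freq < 2:
            break  # no pair repeats, so another rule would not shrink the text
        new_symbol = len(rules)
        rules[new_symbol] = pair
        seq = replace_pair(seq, pair, new_symbol)
    return seq, rules


def expand(symbol, rules):
    """Recursively expand a symbol back into the text it derives."""
    if isinstance(symbol, str):
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

Calling `repair("abracadabra")` returns a shorter sequence together with a rule table, and joining `expand(s, rules)` over that sequence reproduces the input. Because the table is recomputed in every round, this sketch favors clarity over both time and space efficiency.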

Cited by 2 publications (2 citation statements)
References 40 publications
“…Its main drawback is that most implementations need Θ(n) space [11], and hence are not applicable on massive datasets. The only implementation using o(n) space is [22], but it is not practical. There is also work on running Re-Pair on the compressed input [30], but since it already requires the text as a grammar, it is not applicable in our case.…”
Section: Results (mentioning)
confidence: 99%
“…On the other hand, RePair is known to achieve the best compression ratio on many real-world datasets and enjoy applications in web graph compression [10] and XML compression [31]. Some variants of RePair have also been proposed [32,6,17,15,14,28].…”
Section: Related Work (mentioning)
confidence: 99%