2001
DOI: 10.1016/s0020-0255(01)00097-4

Data compression with long repeated strings

Cited by 20 publications (16 citation statements). References 9 publications.
“…Table 5 contains the results of an experiment similar to the one described by Bentley and McIlroy in Ref. [11]. It confirms that PPMII compresses much better than gzip.…”
Section: Complex Gain Function (supporting)
confidence: 69%
“…In the paper by Bentley and McIlroy [11], we can find a description of a very good algorithm created to find long repeated strings. It is a preprocessing algorithm that can interact with many known compression algorithms.…”
Section: Finding Long Repeated Strings (mentioning)
confidence: 99%
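The preprocessing idea cited here can be sketched roughly as follows: fingerprint every block-th window of the input, then, when a later window's fingerprint matches a stored one, extend the match greedily and record the repeat. This is a minimal illustrative sketch, not the paper's exact scheme; the block size, the use of Python's built-in hash in place of a rolling Karp–Rabin fingerprint, and the (offset, length, source) output format are all assumptions.

```python
def find_long_repeats(data: bytes, block: int = 32):
    """Return (offset, length, source) triples for repeats of at least
    `block` bytes, using fingerprints of every block-th window.

    Illustrative sketch: a real implementation would use a rolling
    Karp-Rabin fingerprint instead of rehashing at every position.
    """
    seen = {}       # fingerprint of a block -> position where it started
    repeats = []
    i, n = 0, len(data)
    while i + block <= n:
        chunk = data[i:i + block]
        h = hash(chunk)
        # Verify byte-for-byte, since fingerprints may collide.
        if h in seen and data[seen[h]:seen[h] + block] == chunk:
            src, length = seen[h], block
            # Extend the match greedily to the right.
            while i + length < n and data[src + length] == data[i + length]:
                length += 1
            repeats.append((i, length, src))
            i += length
        else:
            if i % block == 0:   # store only every block-th fingerprint
                seen[h] = i
            i += 1
    return repeats
```

A downstream compressor would then replace each reported span with a short reference to its earlier occurrence, leaving the remaining text for a conventional compressor such as gzip.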
“…To encode large repositories, long repeated strings are identified and then delta encoding is applied [26]. As a practical industrial example, Google adopted such a technique [2] for handling long repeated strings in the collection data of their Bigtable system [6]. RLZ can also be regarded as an application of string substitution, though the substitutions refer to an external dictionary rather than to previous parts of the collection itself.…”
Section: Collection Compression Methods (mentioning)
confidence: 99%
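The RLZ idea mentioned above, parsing the text into factors that reference an external dictionary rather than earlier parts of the text itself, can be sketched as follows. This is an assumed minimal illustration: the naive longest-match scan (real RLZ implementations use a suffix array or FM-index over the dictionary) and the literal-fallback convention for unmatched characters are choices made here, not taken from the cited work.

```python
def rlz_factorize(text: str, dictionary: str):
    """Greedily parse `text` into (pos, len) factors over `dictionary`,
    emitting (char, 0) literal factors when no dictionary match exists."""
    factors = []
    i = 0
    while i < len(text):
        best_pos, best_len = -1, 0
        # Naive longest-match search over every dictionary position.
        for j in range(len(dictionary)):
            l = 0
            while (i + l < len(text) and j + l < len(dictionary)
                   and text[i + l] == dictionary[j + l]):
                l += 1
            if l > best_len:
                best_pos, best_len = j, l
        if best_len == 0:
            factors.append((text[i], 0))   # literal fallback
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors
```

For example, factorizing "abcabz" against the dictionary "abc" yields two dictionary factors and one literal; the compression ratio corresponds to R(C, D) in the notation below, and choosing a good dictionary D is the central design problem.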
“…Notation: Budget for RAM-resident dictionary; F(C, D): factorization of (collection) C against (dictionary) D, a sequence of factors; R(C, D): the corresponding compression ratio of RLZ for (collection) C and (dictionary) D; C_1: concatenation of all 'small' factors from C_2 that are picked by the CuD algorithm; D_2: dictionary sampled from C^L_2 (for Figure 5); D_1: dictionary from the initial tranche (baseline in Table 3); D^o_2: dictionary generated from C^o_2, which is large enough and will not be concatenated with D_1 (baseline in Table 6)…”
mentioning
confidence: 99%