Proceedings DCC'99 Data Compression Conference (Cat. No. PR00096) 1999
DOI: 10.1109/dcc.1999.755678

Data compression using long common strings

Cited by 44 publications (17 citation statements)
References: 11 publications
“…Most of these offline algorithms proceed in a greedy manner, selecting in each iteration one repeated word w according to a score function and replacing all the (non-overlapping) occurrences of the repeat w in the whole grammar by a new terminal N and adding the new production N → w to the grammar. Different heuristics have been used to choose the repeat: the most frequent one [7], the longest [8] and the one that reduces the most the size of the resulting grammar (COMPRESSIVE [9]). GREEDY [10] belongs to this last family but the score used for choosing the words is oriented toward directly optimizing the number of bits needed to encode the grammar rather than minimizing its size.…”
Section: Introduction (mentioning)
confidence: 99%
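The greedy loop described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any of the cited algorithms: the function names (`greedy_grammar`, `expand`), the `max_len` cap on candidate length, and the size-reduction score are assumptions made for the example, and for brevity repeats are searched only in the start sequence rather than in the whole grammar.

```python
def count_nonoverlapping(seq, w):
    """Count left-to-right non-overlapping occurrences of tuple w in seq."""
    count, i, n = 0, 0, len(w)
    while i + n <= len(seq):
        if tuple(seq[i:i + n]) == w:
            count += 1
            i += n
        else:
            i += 1
    return count


def replace_all(seq, w, symbol):
    """Replace non-overlapping occurrences of w in seq by symbol."""
    out, i, n = [], 0, len(w)
    while i < len(seq):
        if tuple(seq[i:i + n]) == w:
            out.append(symbol)
            i += n
        else:
            out.append(seq[i])
            i += 1
    return out


def greedy_grammar(text, max_len=8):
    """Greedy offline sketch: repeatedly factor out the best-scoring repeat.

    Characters are terminals; integers are the fresh symbols N.
    Assumed score: symbols saved = occ * len(w) - (occ + len(w)).
    """
    seq, rules = list(text), {}
    while True:
        candidates = {tuple(seq[i:i + n])
                      for n in range(2, max_len + 1)
                      for i in range(len(seq) - n + 1)}
        best, best_score = None, 0
        for w in candidates:
            occ = count_nonoverlapping(seq, w)
            score = occ * len(w) - occ - len(w)
            if score > best_score:  # ties are broken arbitrarily
                best, best_score = w, score
        if best is None:  # no repeat shrinks the grammar any further
            return seq, rules
        symbol = len(rules)            # fresh symbol N
        rules[symbol] = best           # new production N -> w
        seq = replace_all(seq, best, symbol)


def expand(seq, rules):
    """Decode a sequence by recursively expanding the integer symbols."""
    return "".join(expand(rules[s], rules) if isinstance(s, int) else s
                   for s in seq)
```

Swapping the score for raw frequency or for repeat length would give the other two heuristics mentioned in the statement; only the scoring line changes.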
“…edu/homes/stelo/Off-line/. The file is artificially obtained by concatenating with itself, in an attempt to probe into extreme cases of intersequence correlation [21]. The last two families (8 and 9) are a segment of all the upstream regions of the yeast and, thus, not strongly related. [Table 5: Comparing OFF-LINE with Other Compression Programs on the Chromosomes of the Yeast]…”
Section: Results (mentioning)
confidence: 99%
“…On such inputs, the approach presented here yields scores that are not only better than those of any other method, but also improve increasingly with increasing input size. This is to be attributed to a certain ability to capture distant relationships among the sequences in a family, a feature the merits of which were dramatically exposed in the recent paper [21].…”
Section: Introduction (mentioning)
confidence: 99%
“…Its implementations include Xdelta and open-vcdiff. Both of them divide the old version into chunks, and use the dictionary for the chunk fingerprints [9]. As a more popular tool, Xdelta optimizes the generated instructions and prioritizes speed over compression ratio [10].…”
Section: A Incremental Update Methods (mentioning)
confidence: 99%
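The chunk-fingerprint idea in the statement above can be illustrated with a small sketch. It is not the code of Xdelta or open-vcdiff and does not produce the VCDIFF format: the block size, the `("COPY", offset, length)` and `("ADD", bytes)` instruction tuples, and the use of `zlib.crc32` as the fingerprint are all assumptions chosen for the example, and a practical tool would use a rolling hash instead of rehashing every window.

```python
import zlib

BLOCK = 4  # hypothetical block size, kept tiny for illustration


def build_index(old, block=BLOCK):
    """Map the fingerprint of each aligned block of `old` to its offset."""
    index = {}
    for off in range(0, len(old) - block + 1, block):
        # Keep the first offset seen for a given fingerprint.
        index.setdefault(zlib.crc32(old[off:off + block]), off)
    return index


def make_delta(old, new, block=BLOCK):
    """Encode `new` as COPY (from `old`) and ADD (literal) instructions."""
    index = build_index(old, block)
    ops, literal, i = [], bytearray(), 0
    while i < len(new):
        window = new[i:i + block]
        off = index.get(zlib.crc32(window)) if len(window) == block else None
        # Verify the bytes, since different blocks can share a fingerprint.
        if off is not None and old[off:off + block] == window:
            length = block
            # Extend the match forward as far as both inputs agree.
            while (off + length < len(old) and i + length < len(new)
                   and old[off + length] == new[i + length]):
                length += 1
            if literal:
                ops.append(("ADD", bytes(literal)))
                literal = bytearray()
            ops.append(("COPY", off, length))
            i += length
        else:
            literal.append(new[i])
            i += 1
    if literal:
        ops.append(("ADD", bytes(literal)))
    return ops


def apply_delta(old, ops):
    """Rebuild the new version from the old version and the instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "COPY":
            _, off, length = op
            out += old[off:off + length]
        else:
            out += op[1]
    return bytes(out)
```

For instance, `apply_delta(old, make_delta(old, new))` should return `new` for any pair of byte strings; a smaller block size finds more matches at the cost of a larger index.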