Proceedings DCC 2002. Data Compression Conference
DOI: 10.1109/dcc.2002.999972

Index compression through document reordering

Cited by 73 publications
(89 citation statements)
References 7 publications
“…5 We can represent $f$ using $nH_0(f) + o(n)(H_0(f) + 1)$ bits so that any $f(i)$ can be computed in $O(1)$ time. Moreover, each element of $f^{-1}(a)$ can be computed in $O(\lg\lg\sigma)$ time, and $|f^{-1}(a)|$ requires time $O(\lg\lg\sigma \lg\lg\lg\sigma)$.…”
Section: Compressing Functions and Dynamic Collections Of Disjoint Sets (mentioning)
confidence: 99%
“…Beyond social sciences, data ordering has been popular in gene expression data analysis in bioinformatics [13], [14] and analysis of geographical data [34]. It is also important in bandwidth minimization [35] and data compression [36], [37].…”
Section: B Data Reordering (mentioning)
confidence: 99%
“…Shieh et al. [9] and Blandford and Blelloch [2] built a full document similarity graph and traversed it by different algorithms such as the TSP and recursive splitting. Silvestri et al. [10] used an on-the-fly assignment technique with temporal and spatial complexity linear or superlinear in the number of documents, but also dependent on the average document length.…”
Section: Implementation Considerations (mentioning)
confidence: 99%
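
The statement above describes traversing a document similarity graph to derive a new document order. As a rough illustration only, the sketch below uses a greedy nearest-neighbour walk, a common simple heuristic for TSP-style tours. It is not the algorithm of any of the cited works. The Jaccard similarity measure, the function names `jaccard` and `greedy_order`, and the toy documents are all assumptions made for this example.

```python
def jaccard(a, b):
    """Jaccard similarity between two term sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_order(docs):
    """Visit documents by repeatedly moving to the most similar unvisited one.

    docs maps a document ID to its set of terms. Ties are broken in favour
    of the smallest document ID so the result is deterministic.
    """
    if not docs:
        return []
    remaining = dict(docs)
    current = min(remaining)  # arbitrary but deterministic starting point
    order = [current]
    current_terms = remaining.pop(current)
    while remaining:
        # sorted() plus max() returns the smallest ID among equally similar docs
        nxt = max(sorted(remaining),
                  key=lambda d: jaccard(current_terms, remaining[d]))
        order.append(nxt)
        current_terms = remaining.pop(nxt)
    return order


# Hypothetical toy collection: documents 1 and 3 share terms, as do 2 and 4.
docs = {
    1: {"a", "b"},
    2: {"x", "y"},
    3: {"a", "b", "c"},
    4: {"x", "y", "z"},
}
order = greedy_order(docs)
```

The position of each document in `order` can then serve as its new identifier, so that similar documents receive nearby IDs. Comparing every pair of documents is quadratic in the collection size, which is the kind of cost the statement above contrasts with on-the-fly assignment.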
“…This process is done after the collection is traversed and the inverted file is built. These works are presented in [2] and [9], referred to in the rest of this paper as the B&B (Blandford and Blelloch) approach and the TSP (Travelling Salesman Problem) approach, respectively. Results show that the document identifier reassignment technique is effective in lowering the average d-gap, and therefore allowing gains in compression ratios.…”
Section: Introduction (mentioning)
confidence: 99%
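
To make the d-gap claim above concrete, the following minimal sketch computes the average d-gap of a toy inverted index before and after document identifiers are reassigned. The index contents, the `mapping`, and the helper names `d_gaps`, `average_gap`, and `reassign` are hypothetical and chosen only for illustration. The sketch does not reproduce any cited reassignment algorithm.

```python
def d_gaps(postings):
    """Gaps between consecutive document IDs in a non-empty posting list.

    The first entry is the first document ID itself, a common convention.
    """
    ids = sorted(postings)
    return [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]


def average_gap(index):
    """Mean d-gap over all posting lists of an inverted index."""
    gaps = [g for postings in index.values() for g in d_gaps(postings)]
    return sum(gaps) / len(gaps)


def reassign(index, mapping):
    """Apply an old-ID -> new-ID mapping to every posting list."""
    return {term: [mapping[d] for d in postings]
            for term, postings in index.items()}


# Toy index: each term occurs in documents that are far apart in ID space.
index = {
    "compression": [1, 5, 9],
    "index": [2, 6, 10],
}

# Assumed reassignment that gives documents sharing a term adjacent IDs.
mapping = {1: 1, 5: 2, 9: 3, 2: 4, 6: 5, 10: 6}

before = average_gap(index)
after = average_gap(reassign(index, mapping))
```

Smaller gaps need fewer bits under the variable-length integer codes typically used for posting lists, which is why a lower average d-gap tends to translate into a better compression ratio.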