2001
DOI: 10.1002/spe.394

Self‐adjusting trees in practice for large text collections

Abstract: Splay and randomised search trees are self-balancing binary tree structures with little or no space overhead compared to a standard binary search tree. Both trees are intended for use in applications where node accesses are skewed, for example in gathering the distinct words in a large text collection for index construction. We investigate the efficiency of these trees for such vocabulary accumulation. Surprisingly, unmodified splaying and randomised search trees are on average around 25% slower than using a s…

citations
Cited by 29 publications
(12 citation statements)
references
References 25 publications
“…Searches for rare or new strings are more costly, however, so the performance in practice depends on the distribution of words. In other work [7] we have found BSTs to be approximately as efficient in practice as other tree structures. Each node requires two pointers in addition to the stored string itself.…”
Section: Binary Search Trees (mentioning)
confidence: 81%
“…Another drawback of splaying is the cost of reorganising the tree, with around three comparisons and six assignments for each level. We have found that a practical heuristic that addresses this problem is to only rotate at every nth access, with say n = 11 [7].…”
Section: Binary Search Trees (mentioning)
confidence: 99%
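
The heuristic quoted above (splay only at every nth access) can be illustrated with a minimal sketch. It is not the cited authors' implementation: the names `Node`, `splay`, `PeriodicSplayTree`, `period`, `add` and `items` are hypothetical, the splay routine is a standard recursive formulation, and the sketch assumes a single counter over all accesses with a full splay of the accessed word on every `period`-th one. All other accesses behave like a plain binary search tree lookup or insertion.

```python
class Node:
    """BST node: the stored string, its count, and two child pointers."""
    __slots__ = ("key", "count", "left", "right")

    def __init__(self, key):
        self.key, self.count = key, 1
        self.left = self.right = None


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Move the node holding key (or the last node on its path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # zig-zag
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


class PeriodicSplayTree:
    """Vocabulary accumulator that restructures only on every period-th access."""

    def __init__(self, period=11):       # n = 11 is the value suggested in the quote
        self.root, self.period, self.accesses = None, period, 0

    def add(self, word):
        self.accesses += 1
        if self.root is None:
            self.root = Node(word)
        else:
            node = self.root
            while True:                   # plain BST search / insert, no rotations
                if word == node.key:
                    node.count += 1
                    break
                side = "left" if word < node.key else "right"
                child = getattr(node, side)
                if child is None:
                    setattr(node, side, Node(word))
                    break
                node = child
        if self.accesses % self.period == 0:
            self.root = splay(self.root, word)   # assumed policy: full splay

    def items(self):
        """Yield (word, count) pairs in sorted order (iterative in-order walk)."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.key, node.count
            node = node.right
```

A possible use is `tree = PeriodicSplayTree(); [tree.add(w) for w in text.split()]`, followed by `list(tree.items())` to read off the vocabulary, where `text` is any input string. Because the recursive `splay` descends two levels per call, very deep trees may need an iterative variant in practice.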
“…The reader will observe that we have defined these operators in terms of various cases. This is, conceptually, similar to the zig-zig and zig-zag cases of the tree-based operations already introduced in the literature [11,13,20]. It is, of course, conceivable that we can include all the possible cases under a single umbrella, and then 'pick and choose' those which have to be used in each scenario, i.e.…”
Section: The Stl Operator (mentioning)
confidence: 95%
“…In this way, our heuristic often enables more efficient implementations involving less restructuring than splaying. Indeed, the amount of restructuring performed by splay trees is a limitation in the centralized setting as well, and has been addressed previously; e.g., variants like semi-splaying [26], randomized splaying [3,10] and periodic splaying [29], all attempt to reduce restructuring. We compare the restructuring costs of flattening versus splaying and its variants in our companion document [24].…”
Section: Related Work (mentioning)
confidence: 99%