Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
DOI: 10.1145/3318464.3384706

CDFShop: Exploring and Optimizing Learned Index Structures

Abstract: Indexes are a critical component of data management applications. While tree-like structures (e.g., B-Trees) have been employed to great success, recent work suggests that index structures powered by machine learning models (learned index structures) can achieve low lookup times with a reduced memory footprint. This demonstration showcases CDFShop, a tool to explore and optimize recursive model indexes (RMIs), a type of learned index structure. This demonstration allows audience members to (1) gain an intuitio…
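The abstract's "recursive model index" can be illustrated with a minimal two-stage sketch: a root model routes each key to a second-stage linear model that predicts the key's position, and a per-model error bound limits the final binary search. This is an illustrative toy only, not CDFShop's or the RMI authors' implementation; the class name, fan-out, and least-squares fitting are assumptions.

```python
import bisect

class TwoStageRMI:
    """Toy 2-stage recursive model index over sorted keys.

    Both stages are linear fits to the empirical CDF; a real RMI
    (as tuned by CDFShop) may mix model types per stage.
    """

    def __init__(self, keys, fanout=4):
        self.keys = sorted(keys)
        self.fanout = fanout
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Root: linear map from key to second-stage model index.
        self.root_scale = fanout / (hi - lo + 1)
        self.root_off = lo
        # Partition keys by leaf, then fit one linear model per leaf
        # predicting each key's position in the sorted array.
        buckets = [[] for _ in range(fanout)]
        for pos, k in enumerate(self.keys):
            buckets[self._leaf(k)].append((k, pos))
        self.leaves = []   # (slope, intercept) per leaf model
        self.errs = []     # max absolute prediction error per leaf
        for pts in buckets:
            if not pts:
                self.leaves.append((0.0, 0.0))
                self.errs.append(n)
                continue
            xs = [k for k, _ in pts]
            ys = [p for _, p in pts]
            slope, intercept = self._fit(xs, ys)
            err = max(abs(slope * x + intercept - y)
                      for x, y in zip(xs, ys))
            self.leaves.append((slope, intercept))
            self.errs.append(int(err) + 1)

    def _leaf(self, key):
        i = int((key - self.root_off) * self.root_scale)
        return min(max(i, 0), self.fanout - 1)

    @staticmethod
    def _fit(xs, ys):
        # Ordinary least squares for a single linear model.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        denom = sum((x - mx) ** 2 for x in xs)
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / denom
                 if denom else 0.0)
        return slope, my - slope * mx

    def lookup(self, key):
        """Position of key in the sorted array, or -1 if absent."""
        leaf = self._leaf(key)
        slope, intercept = self.leaves[leaf]
        pred = int(slope * key + intercept)
        err = self.errs[leaf]
        lo = max(0, pred - err)
        hi = min(len(self.keys), pred + err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else -1

keys = [3, 7, 12, 20, 21, 35, 50, 64, 80, 99]
idx = TwoStageRMI(keys, fanout=4)
assert idx.lookup(35) == 5
assert idx.lookup(5) == -1
```

The error bound recorded per leaf is what keeps the "last-mile" search cheap: the binary search only scans a window of width proportional to the model's worst training error.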

Cited by 46 publications (36 citation statements)
References 4 publications
“…RS achieves the lowest build times, due to its single-pass build phase. Note that the build time of PLEX already includes the autotuning time, unlike RS, CHT, or RMI [11], which were tuned offline via an expensive grid search. Our current implementation of CHT does not support key duplicates, which is the case for the wiki dataset.…”
Section: Discussion
confidence: 99%
“…The implementations of RMI, RadixSpline and ALEX are obtained from their open-source repositories [3,45,46]. The RMI hyper-parameters are tuned using CDFShop [40], an automatic RMI optimizer. RadixSpline is manually tuned by varying the error tolerance of the underlying models.…”
Section: Methods
confidence: 99%
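The "expensive grid search" that the excerpts describe for offline RMI tuning can be sketched as follows. This is a hypothetical stand-in, not CDFShop's actual optimizer or API: it searches only a single fan-out parameter of a piecewise-linear CDF model and reports the worst-case prediction error, illustrating the size-versus-error trade-off such a search explores.

```python
import random

def max_error(keys, fanout):
    """Worst |predicted - true| array position for a piecewise-linear
    CDF model with `fanout` equal-width key segments (a stand-in for
    a tuned RMI's second stage; not CDFShop's cost model)."""
    keys = sorted(keys)
    lo, hi = keys[0], keys[-1]
    segments = [[] for _ in range(fanout)]
    for pos, key in enumerate(keys):
        i = min(int((key - lo) * fanout / (hi - lo + 1)), fanout - 1)
        segments[i].append((key, pos))
    worst = 0.0
    for pts in segments:
        if len(pts) < 2:
            continue
        # Line through the segment's first and last (key, position).
        (x0, y0), (x1, y1) = pts[0], pts[-1]
        slope = (y1 - y0) / (x1 - x0)
        for x, y in pts:
            worst = max(worst, abs(y0 + slope * (x - x0) - y))
    return worst

# "Grid search": a larger fan-out shrinks the error bound (cheaper
# last-mile search) but means more models, i.e. a bigger index.
random.seed(0)
keys = random.sample(range(1_000_000), 10_000)
results = {f: max_error(keys, f) for f in (4, 16, 64, 256)}
```

A real tuner such as CDFShop additionally varies model types and layer counts and reports a Pareto front of index size versus lookup time, rather than a single error number.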
“…Note that the assumption of 𝑛 ≫ 𝑚 is valid as, in practice, the RMI optimizer [40] typically models a large input relation with complex distribution using RMI of 2 to 3 levels (excluding the root), and a fan-out 𝑚 of 1000. In the case of 2 levels, for example, the 𝑛 𝑚 ratio becomes 1000, which is relatively large.…”
Section: Buffered GRMI INLJ
confidence: 99%
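The fan-out arithmetic in the excerpt can be checked directly; the concrete key count n below is an assumption chosen to match the quoted n/m ratio of 1000.

```python
# The excerpt's n >> m assumption, with its quoted numbers: a 2-level
# RMI (excluding the root) with fan-out m = 1000 over an assumed
# n = 1,000,000-key relation leaves n/m = 1000 keys per leaf model.
n, m = 1_000_000, 1_000
keys_per_leaf = n // m
assert keys_per_leaf == 1_000
```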
“…However, RMI does not guarantee an error bound for the keys that are not provided in the training phase. The original RMI work provides a solution that can be used in a 2-layer RMI when all models within the RMI are monotonic (Kraska et al, 2018; Marcus et al, 2020; Rashelbach et al, 2020). However, it does not generalize to a 3-layer RMI even if the models are monotonic.…”
Section: P-RMI: Partially-3-Layer RMI
confidence: 99%
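The monotonicity argument in the excerpt can be sketched with a toy model (an assumption-laden illustration, not the cited papers' construction): if the composed model is monotone in the key, then for any unseen query q between trained keys k_i and k_{i+1}, the prediction for q lies between the predictions for k_i and k_{i+1}. Hence the maximum error measured on trained keys, padded by one slot, also bounds the error for keys never seen in training, and a bounded last-mile binary search stays correct.

```python
import bisect
import random

random.seed(1)
keys = sorted(random.sample(range(100_000), 1_000))
n = len(keys)

def model(q):
    # A monotone (slope > 0) linear stand-in for a composed 2-layer RMI.
    return (q - keys[0]) * (n - 1) / (keys[-1] - keys[0])

# Error bound measured on the trained keys only.
eps = max(abs(model(k) - i) for i, k in enumerate(keys))

def lower_bound(q):
    # Monotonicity: for k_i <= q <= k_{i+1}, model(k_i) <= model(q)
    # <= model(k_{i+1}), so the trained-key bound eps, padded by one
    # slot, also brackets the true position of an unseen q.
    pred = model(q)
    hi = min(n, int(pred + eps) + 2)
    lo = max(0, min(int(pred - eps) - 1, hi))
    return bisect.bisect_left(keys, q, lo, hi)

# Holds for query keys never seen during "training".
for q in range(0, 100_000, 997):
    assert lower_bound(q) == bisect.bisect_left(keys, q)
```

The excerpt's point is that this padding argument composes cleanly across two monotone layers but breaks down for a 3-layer RMI, which is what motivates the partially-3-layer design.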