2015 IEEE 35th International Conference on Distributed Computing Systems 2015
DOI: 10.1109/icdcs.2015.53

Fast Compaction Algorithms for NoSQL Databases

Abstract: Compaction plays a crucial role in NoSQL systems to ensure a high overall read throughput. In this work, we formally define compaction as an optimization problem that attempts to minimize disk I/O. We prove this problem to be NP-hard. We then propose a set of algorithms and mathematically analyze upper bounds on worst-case cost. We evaluate the proposed algorithms on real-life workloads. Our results show that our algorithms incur low I/O costs and that a compaction approach using a balanced tree is mos…
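
To make the idea of compaction as a cost-minimization problem concrete, the following is a minimal sketch and not the paper's algorithms. It assumes a simplified cost model that the abstract does not state: each table is a set of keys, tables are merged two at a time, and one merge costs the number of keys read from both inputs plus the number of keys written to the output. The function names are hypothetical. The sketch compares a naive left-to-right merge order with a greedy order that always merges the two smallest tables first.

```python
import heapq
from itertools import count


def merge_cost_left_to_right(tables):
    """Total cost of folding the tables into one in the given order.

    Assumed cost model (illustrative only): a merge of A and B costs
    |A| + |B| for the reads plus |A ∪ B| for the write.
    """
    sets = [frozenset(t) for t in tables]
    if not sets:
        return 0
    total = 0
    acc = sets[0]
    for s in sets[1:]:
        merged = acc | s
        total += len(acc) + len(s) + len(merged)
        acc = merged
    return total


def merge_cost_smallest_first(tables):
    """Total cost of a greedy schedule that merges the two smallest tables.

    This is a generic Huffman-style heuristic under the same assumed cost
    model; it is not guaranteed to be optimal.
    """
    tie = count()  # tie-breaker so the heap never compares frozensets
    heap = [(len(t), next(tie), frozenset(t)) for t in tables]
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        size_a, _, a = heapq.heappop(heap)
        size_b, _, b = heapq.heappop(heap)
        merged = a | b
        total += size_a + size_b + len(merged)
        heapq.heappush(heap, (len(merged), next(tie), merged))
    return total


if __name__ == "__main__":
    # One large table followed by small ones: order matters a lot here.
    tables = [{1, 2, 3, 4, 5, 6}, {1, 2}, {3}, {4}]
    print(merge_cost_left_to_right(tables))
    print(merge_cost_smallest_first(tables))
```

In this example the left-to-right order rereads and rewrites the large table at every step, while the smallest-first order touches it only once. The merge order alone therefore changes the total I/O under the assumed cost model.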

Cited by 6 publications (7 citation statements)
References 13 publications
“…Several methods based on a log-structured file system have been developed [3], [17], [18], [20], [22], [23], [24], [31], [32], [34]. The LWC tree [10] offers metadata-based compaction without separating keys from values.…”
Section: Related Work (mentioning)
confidence: 99%
“…We remark that a problem similar to PLT training cost has been studied in the database literature. In particular, the problem of finding an optimal binary tree is proven to be NP-hard in (Ghosh et al, 2015). Note that the result of (Ghosh et al, 2015) does not imply hardness of the PLT training cost problem.…”
Section: Hardness Of Training Cost Minimization (mentioning)
confidence: 99%
“…The problem of optimizing the training cost is closely related to the binary merging problem in databases (Ghosh et al, 2015). The hardness result in (Ghosh et al, 2015), however, does not generalize to our setting as it is limited to binary trees only. Nevertheless, our approximation result is partly based on the results from (Ghosh et al, 2015).…”
Section: Introduction (mentioning)
confidence: 99%
“…Optimized Multi-BRWT Construction. By analogy to the NoSQL table compaction problem [9], it can be shown that Multi-BRWT constrained on the space of binary trees with the uncompressed bit vector representation as the underlying structure for storing the index vectors is NP-hard. Thus, we propose a two-step approach for finding a good Multi-BRWT structure (see Figure 2).…”
Section: Multiary Topology-optimized BRWTs (mentioning)
confidence: 99%