High-speed data ingestion is critical in time-series workloads that are driven by the growth of Internet of Things (IoT) applications. We observe that traditional tree-based indexes encounter severe scalability bottlenecks for time-series workloads that insert monotonically increasing timestamp keys into an index; all insertions go to a small memory region that sees extremely high contention.
In this work, we present a new index design,
B
link
-hash, that enhances a tree-based index with hash leaf nodes to mitigate the contention of monotonic insertions --- insertions go to random locations within a hash node (which is much larger than a B+-tree node) to reduce conflicts. We develop further optimizations (median approximation and lazy split) to accelerate hash node splits. We also develop structure adaptation optimizations to dynamically convert a hash node to B+-tree nodes for good scan performance. Our evaluation shows that
B
link
-hash achieves up to 91.3× higher throughput than conventional indexes in a time-series workload that monotonically inserts timestamps into an index, while showing comparable scan performance to a well-optimized B+-tree.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.