2020 Proceedings of the Twenty-Second Workshop on Algorithm Engineering and Experiments (ALENEX) 2020
DOI: 10.1137/1.9781611976007.14

RecSplit: Minimal Perfect Hashing via Recursive Splitting

Abstract: A minimal perfect hash function bijectively maps a key set S out of a universe U into the first |S| natural numbers. Minimal perfect hash functions are used, for example, to map irregularly-shaped keys, such as strings, in a compact space so that metadata can then be simply stored in an array. While it is known that just 1.44 bits per key are necessary to store a minimal perfect hash function, no published technique can go below 2 bits per key in practice. We propose a new technique for storing minimal perfect has…
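
The 1.44 bits per key quoted in the abstract matches log2(e) ≈ 1.4427. A common counting argument (not spelled out in the abstract, and assuming the universe is much larger than the key set) is that a uniformly random function from n keys to n slots is a bijection with probability n!/n^n, so roughly log2(n^n/n!) bits are needed to single one out. The sketch below only evaluates that expression per key for growing n; the function name is hypothetical.

```python
import math


def bits_per_key_lower_bound(n: int) -> float:
    """Return log2(n^n / n!) / n, the per-key cost from the counting argument."""
    # math.lgamma(n + 1) equals ln(n!), which avoids building huge integers.
    ln_ratio = n * math.log(n) - math.lgamma(n + 1)
    return ln_ratio / math.log(2) / n


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, round(bits_per_key_lower_bound(n), 4))
    # The values approach log2(e) from below as n grows.
    print("log2(e) =", round(math.log2(math.e), 4))
```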

Cited by 18 publications (22 citation statements)
References 14 publications
“…In the experimental part of this work (Section 4), we show that when applied to the k-mer counting, the error of Count-Min may not be acceptable. The construction of a MPHF can be hyper-graph peeling-based [16,17] or array-based [18]. The first family of algorithms leads to smaller MPHFs, close to the theoretical space lower-bound of 1.44 bits per key, while array-based MPHFs are more cache friendly and much easier conceptually despite being less memory efficient than their mainstream counterparts.…”
Section: K-mer Spectrum
mentioning
confidence: 99%
“…The construction of MPHFs can be hyper-graph peeling-based [19,20] or array-based [21]. The first family of algorithms leads to smaller MPHFs, close to theoretical space lower-bound of 1.44 bits per key, while array-based MPHFs are conceptually simpler and have practical implementations for k-mer sets, such as BBHash [22].…”
Section: Minimal Perfect Hashing
mentioning
confidence: 99%
“…Hence, recent efforts have been made to use minimal perfect hash functions (MPHFs) [10,18,26] for in-memory key-value lookups, which significantly reduce the space cost by avoiding storing keys. For a set of n key-value items where each item is a tuple (k_i, v_i) of key k_i and value v_i, a minimal perfect hash function H′ maps the n keys to integers 0 to n − 1 without collision.…”
mentioning
confidence: 99%
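
The key-value use described in the statement above can be sketched as follows. This is an illustration only, not the cited work's implementation: `build_value_array` and `toy_mphf` are hypothetical names, and the toy function is minimal perfect only for the three example keys (it maps each key to its length minus one).

```python
from typing import Callable, List, Sequence, Tuple


def build_value_array(
    items: Sequence[Tuple[str, int]], mphf: Callable[[str], int]
) -> List[int]:
    """Place each value at the slot its key hashes to; the keys are not stored."""
    n = len(items)
    values = [0] * n
    filled = [False] * n
    for key, value in items:
        slot = mphf(key)
        # A minimal perfect hash must hit every slot in 0..n-1 exactly once.
        if not (0 <= slot < n) or filled[slot]:
            raise ValueError("mphf is not minimal perfect on these keys")
        values[slot] = value
        filled[slot] = True
    return values


def toy_mphf(key: str) -> int:
    """Assumption for illustration: perfect only for keys of lengths 1, 2, 3."""
    return len(key) - 1


items = [("bb", 20), ("a", 10), ("ccc", 30)]
values = build_value_array(items, toy_mphf)
print(values[toy_mphf("bb")])
```

Because the keys are dropped, a lookup with a key outside the original set still returns some stored value; the structure cannot detect absent keys on its own.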
“…Step ii): For each group we find a hash function H such that H maps the four keys to integers 0 to 3 without collision. For most modern random hash function algorithms, we may generate an independent hash function H_s by using a [18] Not allowed…”
mentioning
confidence: 99%
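
A minimal sketch of the per-group search in "Step ii" above, under stated assumptions: the quoted statement is cut off before it says how H_s is derived, so prefixing the key with the seed before hashing is a stand-in, and `seeded_hash` and `find_seed` are hypothetical names. For four distinct keys a random function is collision-free with probability 4!/4^4 ≈ 9.4%, so a suitable seed is typically found after a handful of tries.

```python
import hashlib
from typing import Sequence


def seeded_hash(key: str, seed: int, m: int) -> int:
    """Hash `key` into 0..m-1 with a function selected by `seed` (assumed scheme)."""
    data = seed.to_bytes(8, "little") + key.encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") % m


def find_seed(group: Sequence[str], max_tries: int = 10_000) -> int:
    """Try seeds 0, 1, 2, ... until the group's keys land in distinct slots."""
    m = len(group)
    for seed in range(max_tries):
        if len({seeded_hash(key, seed, m) for key in group}) == m:
            return seed
    raise RuntimeError("no collision-free seed found; are the keys distinct?")


group = ["ACGT", "TTGA", "GGCA", "CATG"]
seed = find_seed(group)
print(seed, [seeded_hash(key, seed, len(group)) for key in group])
```

Only the seed needs to be stored per group; the mapping itself is recomputed from the seed at lookup time.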