Abstract. This paper presents moment analyses and characterizations of limit distributions for the construction cost of hash tables under the linear probing strategy. Two models are considered, that of full tables and that of sparse tables with a fixed filling ratio strictly smaller than one. For full tables, the construction cost has expectation O(n^(3/2)), the standard deviation is of the same order, and a limit law of the Airy type holds. (The Airy distribution is a semiclassical distribution that is defined in terms of the usual Airy functions or equivalently in terms of Bessel functions of indices −1/3, 2/3.) For sparse tables, the construction cost has expectation O(n), standard deviation O(√n), and a limit law of the Gaussian type. Combinatorial relations with other problems leading to Airy phenomena (like graph connectivity, tree inversions, tree path length, or area under excursions) are also briefly discussed.
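The construction cost discussed in this abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the cost is the total displacement (number of extra probes) accumulated while inserting keys with uniform random hash addresses, and the names `total_displacement` and `mean_cost` are hypothetical.

```python
import random


def total_displacement(m, hashes):
    """Insert keys with the given home addresses into a table of m slots
    using linear probing; return the sum of all displacements."""
    if len(hashes) > m:
        raise ValueError("more keys than slots")
    occupied = [False] * m
    total = 0
    for home in hashes:
        pos = home % m
        # Walk cyclically to the right until a free slot is found.
        while occupied[pos]:
            pos = (pos + 1) % m
            total += 1
        occupied[pos] = True
    return total


def mean_cost(m, n, trials=200, seed=0):
    """Average construction cost over random tables (assumed model:
    each key hashes uniformly and independently to one of m slots)."""
    rng = random.Random(seed)
    costs = [
        total_displacement(m, [rng.randrange(m) for _ in range(n)])
        for _ in range(trials)
    ]
    return sum(costs) / trials
```

Under these assumptions, comparing `mean_cost(m, m - 1)` with `mean_cost(m, m // 2)` for growing `m` gives a rough feel for the n^(3/2) versus linear growth stated above.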
We present an analysis of the effect of the last-come-first-served heuristic on a linear probing hash table. We study the behavior of successful searches, assuming searches for all elements of the table are equally likely. It is known that the Robin Hood heuristic achieves minimum variance over all linear probing algorithms. We show that the last-come-first-served heuristic achieves this minimum up to lower-order terms. An accurate analysis of this algorithm is made by introducing a new transform which we call the Diagonal Poisson Transform, as it resembles the Poisson Transform. We present important properties of this transform, as well as its application to solve some classes of recurrences, find inverse ... (This work was done while the first two authors were at the University of Waterloo. Correspondence to: A. Viola)
This paper studies the distribution of individual displacements for the standard and the Robin Hood linear probing hashing algorithms. When a table of size m has n elements, the distribution of the search cost of a random element is studied for both algorithms. Specifically, exact distributions are found for fixed m and n, as well as for the case where the table is α-full with α strictly smaller than 1. Moreover, for full tables, limit laws for both algorithms are derived.
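The difference between the two insertion policies can be made concrete with a short sketch. It is an illustration only, not the paper's formulation: the function name `displacements` is hypothetical, each slot simply records the home address of the key stored in it, and ties under Robin Hood are resolved in favor of the resident key (an assumption).

```python
def displacements(m, hashes, robin_hood=False):
    """Insert keys with the given home addresses into m slots by linear
    probing and return the sorted list of individual displacements."""
    if len(hashes) > m:
        raise ValueError("more keys than slots")
    table = [None] * m  # table[pos] = home address of the key stored at pos
    for home in hashes:
        cur = home % m  # home address of the key currently being placed
        pos = cur
        while table[pos] is not None:
            if robin_hood:
                resident = table[pos]
                # The key that has traveled further keeps the slot;
                # the other one continues probing.
                if (pos - cur) % m > (pos - resident) % m:
                    table[pos], cur = cur, resident
            pos = (pos + 1) % m
        table[pos] = cur
    return sorted((pos - h) % m for pos, h in enumerate(table) if h is not None)
```

For the home addresses `[0, 1, 0]` in a table of 5 slots, the standard policy yields displacements `[0, 0, 2]` while Robin Hood yields `[0, 1, 1]`: the total is the same, but Robin Hood spreads it more evenly across keys.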
We consider open addressing hashing, and implement it by using the Robin Hood strategy, that is, in case of collision, the element that has traveled the furthest can stay in the slot. We hash ∼ αn elements into a table of size n where each probe is independent and uniformly distributed over the table, and α < 1 is a constant. Let M_n be the maximum search time for any of the elements in the table. We show that with probability tending to one, M_n ∈ [log_2 log n + σ, log_2 log n + τ] for some constants σ, τ depending upon α only. This is an exponential improvement over the maximum search time in case of the standard FCFS (first come first served) collision strategy, and virtually matches the performance of multiple choice hash methods.
Quickselect with median-of-3 is largely used in practice and its behavior is fairly well understood. However, the following natural adaptive variant, which we call proportion-from-3, had not been previously analyzed: “choose as pivot the smallest of the sample if the relative rank of the sought element is below 1/3, the largest if the relative rank is above 2/3, and the median if the relative rank is between 1/3 and 2/3.” We first analyze the average number of comparisons made when using proportion-from-2 and then for proportion-from-3. We also analyze ν-find, a generalization of proportion-from-3 with interval breakpoints at ν and 1 − ν. We show that there exists an optimal value of ν and we also provide the range of values of ν where ν-find outperforms median-of-3. Then, we consider the average total cost of these strategies, which takes into account the cost of both comparisons and exchanges. Our results strongly suggest that a suitable implementation of ν-find could be the method of choice in a practical setting. We also study the behavior of proportion-from-s with s > 3 and in particular we show that proportion-from-s-like strategies are optimal when s → ∞.
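The pivot rule quoted in this abstract can be sketched as follows. This is a minimal out-of-place illustration, not the authors' implementation: the function name `proportion_from_3_select` is hypothetical, the relative rank is taken to be k / n for a 0-based rank k (an assumption), and the list-based partitioning ignores the exchange costs the paper analyzes.

```python
import random


def proportion_from_3_select(items, k, rng=None):
    """Return the k-th smallest element (0-based) of items, choosing each
    pivot from a random sample of 3 according to the relative rank of k."""
    rng = rng or random.Random()
    items = list(items)
    if not 0 <= k < len(items):
        raise IndexError("rank out of range")
    while True:
        n = len(items)
        if n <= 3:
            return sorted(items)[k]
        sample = sorted(rng.sample(items, 3))
        relative_rank = k / n  # assumed definition of the relative rank
        if relative_rank < 1 / 3:
            pivot = sample[0]  # sought element is low: take the smallest
        elif relative_rank > 2 / 3:
            pivot = sample[2]  # sought element is high: take the largest
        else:
            pivot = sample[1]  # otherwise: take the median
        smaller = [x for x in items if x < pivot]
        larger = [x for x in items if x > pivot]
        equal = n - len(smaller) - len(larger)  # copies of the pivot value
        if k < len(smaller):
            items = smaller
        elif k < len(smaller) + equal:
            return pivot
        else:
            k -= len(smaller) + equal
            items = larger
```

Replacing the breakpoints 1/3 and 2/3 with ν and 1 − ν would give a comparable sketch of ν-find under the same assumptions.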