We study the fundamental problem of approximate nearest neighbor search in $d$-dimensional Hamming space $\{0, 1\}^d$. We analyze its complexity in the cell-probe model, a classic model for data structures. We consider cell-probe algorithms with limited adaptivity, where the algorithm makes $k$ rounds of parallel accesses to the data structure for a given $k$. For any $k \ge 1$, we give a simple randomized algorithm that solves approximate nearest neighbor search using $k$ rounds of parallel memory accesses, with $O(k (\log d)^{1/k})$ accesses in total. We also give a more sophisticated randomized algorithm using $O(k + (\ldots)^{O(1/k)})$ memory accesses in $k$ rounds for large enough $k$. Both algorithms use data structures of size polynomial in $n$, the number of points in the database. For the lower bound, we prove an $\Omega(\ldots)$ lower bound on the total number of memory accesses required by any randomized algorithm that solves approximate nearest neighbor search within $k \le \frac{\log\log d}{2\log\log\log d}$ rounds of parallel memory accesses on any data structure of polynomial size. This lower bound shows that our first algorithm is asymptotically optimal for any constant number of rounds $k$. Our second algorithm approaches the asymptotically optimal tradeoff between rounds and memory accesses, in the sense that the lower bound on memory accesses for any $k_1$ rounds can be matched by the algorithm within $k_2 = O(k_1)$ rounds. In the extreme, for some large enough $k = \Theta\left(\frac{\log\log d}{\log\log\log d}\right)$, our second algorithm matches the tight $\Theta\left(\frac{\log\log d}{\log\log\log d}\right)$ bound for fully adaptive algorithms for approximate nearest neighbor search due to Chakrabarti and Regev [10].
Introduction

Nearest neighbor search is a fundamental theoretical problem in computer science, with a great many applications in diverse fields. In the nearest neighbor search problem, we are given a database $B$ of $n$ points from a metric space $X$. The goal is to preprocess them into a data structure such that, given any query point $x \in X$, an algorithm with access to the data structure can find a point in $B$ that is closest to $x$ among all database points. An extensively studied case is when the metric space is the Hamming space $X = \{0, 1\}^d$.

It is conjectured that nearest neighbor search is hard to solve by any data structure when the dimension $d$ is high (e.g. $d \gg \log n$). This conjecture is sometimes referred to as a case of the "curse of dimensionality" and is one of the central problems in the area of data structure lower