Load balancing for partition-based similarity search

Tang, Xun; Alabduljalil, Maha; Jin, Xin; Yang, Tao

doi:10.1145/2600428.2609624

Cited by 8 publications

(6 citation statements)

References 22 publications

“…Traditional NNG construction methods could not scale to sets of object this large. Given the growing popularity of cloud computing, some of the traditional NNS methods were ported to cloud programming frameworks developed for dealing with big data (e.g., Hadoop, Spark) [1,2,14,18,31,38,39,43,46]. Most of the solutions use the MapReduce [20] framework and can be split into two categories.…”

Section: Related Work (mentioning)

confidence: 99%

“…The second category of MapReduce methods use a mapper-only scheme, with no reducers [1,2,43]. They partition the set of objects into subsets (blocks) and use serial APSS methods to find pairwise similarities of objects in block pairs.…”

Section: Related Work (mentioning)

confidence: 99%

“…They found that filtering candidates was detrimental to execution speed and suggested removing this optimization, rendering their local search identical to that performed in one tile by our naïve baseline, pIdxJoin. Within this context, they examined distributed load balancing strategies [43] and cache-conscious performance optimizations for the local searches [1]. They provided a cost based analysis aimed at finding sizes for comparison blocks that maximize cache locality.…”

Section: Related Work (mentioning)

confidence: 99%

Parallel cosine nearest neighbor graph construction

Anastasiu

Karypis

2019

Journal of Parallel and Distributed Computing

The nearest neighbor graph is an important structure in many data mining methods for clustering, advertising, recommender systems, and outlier detection. Constructing the graph requires computing up to n² similarities for a set of n objects. This high complexity has led researchers to seek approximate methods, which find many but not all of the nearest neighbors. In contrast, we leverage shared memory parallelism and recent advances in similarity joins to solve the problem exactly. Our method considers all pairs of potential neighbors but quickly filters pairs that could not be a part of the nearest neighbor graph, based on similarity upper bound estimates. The filtering is data dependent and not easily predicted, which poses load balance challenges in parallel execution. We evaluated our methods on several real-world datasets and found they work up to two orders of magnitude faster than existing methods, display linear strong scaling characteristics, and incur less than 1% load imbalance during filtering.
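
The filtering idea in this abstract can be illustrated with a small serial sketch. It is not the authors' algorithm: the bound used here (a partial dot product over the leading dimensions plus a Cauchy-Schwarz bound on the rest) is an assumption chosen for simplicity, and the names `knn_graph`, `split`, and `pruned` are hypothetical. A pair is skipped only when its upper bound cannot exceed the current k-th best similarity, so the result stays exact.

```python
import heapq
import math


def normalize(v):
    """Scale a dense vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def knn_graph(vectors, k, split):
    """Exact k-nearest-neighbor graph under cosine similarity.

    `split` (hypothetical parameter) is the number of leading dimensions
    used for the partial dot product that feeds the upper bound.
    Returns (neighbors, pruned), where neighbors[i] is a list of
    (similarity, j) tuples sorted by decreasing similarity.
    """
    unit = [normalize(v) for v in vectors]
    # Norm of the trailing dimensions, needed by the Cauchy-Schwarz bound.
    tail = [math.sqrt(sum(x * x for x in u[split:])) for u in unit]
    neighbors, pruned = [], 0
    for i, q in enumerate(unit):
        heap = []  # min-heap holding the k best (similarity, j) seen so far
        for j, c in enumerate(unit):
            if i == j:
                continue
            head = sum(a * b for a, b in zip(q[:split], c[:split]))
            # dot(q, c) <= head + |q_tail| * |c_tail|, so this bound is safe.
            bound = head + tail[i] * tail[j]
            if len(heap) == k and bound <= heap[0][0]:
                pruned += 1  # cannot beat the current k-th neighbor
                continue
            sim = head + sum(a * b for a, b in zip(q[split:], c[split:]))
            if len(heap) < k:
                heapq.heappush(heap, (sim, j))
            elif sim > heap[0][0]:
                heapq.heapreplace(heap, (sim, j))
        neighbors.append(sorted(heap, reverse=True))
    return neighbors, pruned
```

The number of pruned pairs differs from one query object to the next, which gives a rough sense of why the abstract calls the filtering data dependent and hard to balance across threads.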

“…The second category of MapReduce methods use a mapper-only scheme, with no reducers [1,2,22]. They partition the set of objects into subsets (blocks) and use serial APSS methods to find pairwise similarities of objects in block pairs.…”

Section: Related Work (mentioning)

confidence: 99%

“…However, these methods suffer from high communication costs which make them inefficient for large datasets [2]. Partition based MapReduce methods [1,2,22] address this problem via block data decomposition, using serial APSS methods on MapReduce nodes to compute pairwise similarities between objects in block pairs. These methods could further benefit from multi-core parallel APSS solutions, which are not prevalent in the literature.…”

Section: Introduction (mentioning)

confidence: 99%

pL2AP

Anastasiu

Karypis

2015

Proceedings of the 5th Workshop on Irregular Applications: Architectures and Algorithms

Solving the AllPairs similarity search problem entails finding all pairs of vectors in a high dimensional sparse dataset that have a similarity value higher than a given threshold. The output from this problem is a crucial component in many real-world applications, such as clustering, online advertising, recommender systems, near-duplicate document detection, and query refinement. A number of serial algorithms have been proposed that solve the problem by pruning many of the possible similarity candidates for each query object, after accessing only a few of their non-zero values. The pruning process results in unpredictable memory access patterns that can reduce search efficiency. In this context, we introduce pL2AP, which efficiently solves the AllPairs cosine similarity search problem in a multi-core environment. Our method uses a number of cache-tiling optimizations, combined with fine-grained dynamically balanced parallel tasks, to solve the problem 1.5x-238x faster than existing parallel baselines on datasets with hundreds of millions of non-zeros.
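
For orientation, the problem statement in this abstract can be written as a short serial baseline. This is a minimal sketch under stated assumptions, not pL2AP itself: it uses a plain inverted index with no pruning, no cache tiling, and no parallelism, and the function name `all_pairs` and the dict-based sparse vector format are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def all_pairs(vectors, threshold):
    """Return {(i, j): cosine} for every pair i < j with cosine > threshold.

    `vectors` is a list of sparse vectors, each a dict {feature: weight}
    with at least one non-zero weight.
    """
    # Normalize to unit length so the dot product is the cosine similarity.
    unit = []
    for v in vectors:
        norm = math.sqrt(sum(w * w for w in v.values()))
        unit.append({f: w / norm for f, w in v.items()})

    index = defaultdict(list)  # feature -> [(vector id, weight)] of earlier vectors
    result = {}
    for j, q in enumerate(unit):
        scores = defaultdict(float)  # candidate id -> accumulated dot product
        for f, w in q.items():
            for i, wi in index[f]:
                scores[i] += w * wi
        for i, s in scores.items():
            if s > threshold:
                result[(i, j)] = s
        # Index the query only after searching, so each pair is visited once.
        for f, w in q.items():
            index[f].append((j, w))
    return result
```

Because every shared feature is accumulated before any candidate is rejected, this baseline does the work that the pruning strategies described in the abstract are designed to avoid.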
