DOI: 10.1002/cpe.5474 (2019)
Performance evaluation of single vs. batch of queries on GPUs

Abstract: The WAND processing strategy is a dynamic pruning algorithm designed for large-scale Web search engines, where fast response to queries is a critical service. WAND reduces the amount of computation by scoring only documents that may become part of the top-k document results. In this paper, we present two parallel strategies for the WAND algorithm and compare their performance on GPUs. In our first strategy (named size-based), the posting lists are evenly partitioned among thread blocks. Ou…
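The pruning idea the abstract describes can be shown in a short sequential sketch. This is illustrative only and under our own assumptions: `wand_topk`, `upper_bounds`, and `score` are hypothetical names, and the paper's actual contribution is the GPU parallelization, not this loop.

```python
import heapq

def wand_topk(query_terms, postings, upper_bounds, score, k):
    """Sequential sketch of WAND-style pruning (hypothetical names; this
    is not the paper's GPU implementation). A document is fully scored
    only when the sum of the upper-bound scores of its query terms can
    still beat the current k-th best score (the threshold)."""
    heap = []        # min-heap of (score, doc_id) holding the current top-k
    threshold = 0.0  # score of the current k-th best document
    # Candidate documents: union of the query terms' posting lists.
    candidates = sorted({d for t in query_terms for d in postings[t]})
    for doc in candidates:
        present = [t for t in query_terms if doc in postings[t]]
        # Pruning step: skip full scoring when even the optimistic
        # upper bound cannot beat the current threshold.
        if len(heap) == k and sum(upper_bounds[t] for t in present) <= threshold:
            continue
        s = sum(score(t, doc) for t in present)  # full scoring
        if len(heap) < k:
            heapq.heappush(heap, (s, doc))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, doc))
        if len(heap) == k:
            threshold = heap[0][0]
    return sorted(heap, reverse=True)
```

The GPU strategies evaluated in the paper distribute this scoring work across thread blocks; the pruning condition itself is what saves computation in either setting.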

Cited by 6 publications (3 citation statements). References 21 publications.
“…Despite some similarities, this work focuses on reusing input data and intermediate results to reduce recomputing and I/O redundancies that cause bandwidth waste. Similar techniques that group work units into small batches have been used in other application areas, such as in web search engines [6,13]. In summary, these works use mini-batching for grouping processing units into larger ones that increase the utilization of resources in multi-core or distributed processing systems.…”
Section: Other Mini-batching Approaches
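The mini-batching idea this statement describes, grouping small work units into larger ones to improve resource utilization, reduces to something as simple as the following (a hypothetical helper, not code from any of the cited systems):

```python
def make_batches(queries, batch_size):
    """Group incoming queries into fixed-size mini-batches (hypothetical
    helper): each batch becomes one larger unit of work, which amortizes
    per-unit overheads and raises utilization of the processing system."""
    return [queries[i:i + batch_size] for i in range(0, len(queries), batch_size)]
```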
“…To date, this is the first work to propose a strategy for improving memory access locality for parallel implementation of bagging ensembles on multi-core systems. Our work employs both measurement techniques and theoretical foundations proposed in [48] to demonstrate the benefits of mini-batching for the implementation of ensembles. The present work is different from previous work as we focus on a class of ensemble algorithms composed of bagging ensembles executing in the context of data streams.…”

Mini-batching approaches summarized in the citing work:

Reference | Goal | Application
[6] | Increase the utilization of resources | Web search engines
Carbone et al [7] | Fault tolerance and performance | Apache Flink
Gaioso et al [13] | Increase the utilization of resources | Web search engines
He et al [19] | Reduce recomputing and I/O redundancies | Large-scale data streams
Kukreja et al [26] | Reduce data movement | Large-scale FWI
Wang et al [42] | Energy optimization | Real-time tasks on heterogeneous sensors
Wen et al [45] | Reduce data on weight matrix | ANNs for image classification
Zaharia et al [49] | Fault tolerance and performance | Apache Spark
Zhang et al [50] | Reduce delay and energy consumption | DNNs on the edge

Section: How Our Work Is Different From Others
“…Web search engines rely on computationally expensive operations, like the WAND ranking algorithm, to process thousands of queries per second. Gaioso et al 5 proposed parallel strategies for single queries and batches of queries executed with the WAND ranking algorithm on GPUs. The authors presented and evaluated two strategies to partition the documents, named size-based and range-based.…”
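One way to read the size-based strategy mentioned above, evenly partitioning posting lists among thread blocks, is as a least-loaded greedy assignment. This is a sketch under our own assumptions (`size_based_partition` is a hypothetical name; the authors' actual GPU partitioning may differ):

```python
import heapq

def size_based_partition(posting_sizes, num_blocks):
    """Greedy least-loaded sketch of size-based partitioning (hypothetical,
    not the authors' GPU code): assign each posting list, largest first, to
    the thread block with the smallest current total size, so the work per
    block stays balanced."""
    # Min-heap of (current_load, block_id).
    loads = [(0, b) for b in range(num_blocks)]
    heapq.heapify(loads)
    assignment = {}
    for term, size in sorted(posting_sizes.items(), key=lambda kv: -kv[1]):
        load, block = heapq.heappop(loads)
        assignment[term] = block
        heapq.heappush(loads, (load + size, block))
    return assignment
```

A range-based alternative would instead split the document-ID range and assign each block the posting-list segments falling in its range; the trade-off the citing statement alludes to is load balance versus locality.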