DOI: 10.1002/cpe.5474 (2019)
Performance evaluation of single vs. batch of queries on GPUs

Abstract: The WAND processing strategy is a dynamic pruning algorithm designed for large-scale Web search engines, where fast response to queries is a critical service. WAND reduces the amount of computation by scoring only documents that may become part of the top-k document results. In this paper, we present two parallel strategies for the WAND algorithm and compare their performance on GPUs. In our first strategy (named size-based), the posting lists are evenly partitioned among thread blocks. Ou…
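The pruning idea the abstract describes can be shown in a short sequential sketch. This is illustrative only and under our own assumptions: `wand_topk`, `upper_bounds`, and `score` are hypothetical names, and the paper's actual contribution is the GPU parallelization, not this loop.

```python
import heapq

def wand_topk(query_terms, postings, upper_bounds, score, k):
    """Sequential sketch of WAND-style pruning (hypothetical names; this
    is not the paper's GPU implementation). A document is fully scored
    only when the sum of the upper-bound scores of its query terms can
    still beat the current k-th best score (the threshold)."""
    heap = []        # min-heap of (score, doc_id) holding the current top-k
    threshold = 0.0  # score of the current k-th best document
    # Candidate documents: union of the query terms' posting lists.
    candidates = sorted({d for t in query_terms for d in postings[t]})
    for doc in candidates:
        present = [t for t in query_terms if doc in postings[t]]
        # Pruning step: skip full scoring when even the optimistic
        # upper bound cannot beat the current threshold.
        if len(heap) == k and sum(upper_bounds[t] for t in present) <= threshold:
            continue
        s = sum(score(t, doc) for t in present)  # full scoring
        if len(heap) < k:
            heapq.heappush(heap, (s, doc))
        elif s > heap[0][0]:
            heapq.heapreplace(heap, (s, doc))
        if len(heap) == k:
            threshold = heap[0][0]
    return sorted(heap, reverse=True)
```

The GPU strategies evaluated in the paper distribute this scoring work across thread blocks; the pruning condition itself is what saves computation in either setting.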

Cited by 6 publications (3 citation statements). References 21 publications.
“…Despite some similarities, this work focuses on reusing input data and intermediate results to reduce recomputing and I/O redundancies that cause bandwidth waste. Similar techniques that group work units into small batches have been used in other application areas, such as in web search engines [6,13]. In summary, these works use mini-batching for grouping processing units into larger ones that increase the utilization of resources in multi-core or distributed processing systems.…”
Section: Other Mini-batching Approaches
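The mini-batching idea this statement describes, grouping small work units into larger ones to improve resource utilization, reduces to something as simple as the following (a hypothetical helper, not code from any of the cited systems):

```python
def make_batches(queries, batch_size):
    """Group incoming queries into fixed-size mini-batches (hypothetical
    helper): each batch becomes one larger unit of work, which amortizes
    per-unit overheads and raises utilization of the processing system."""
    return [queries[i:i + batch_size] for i in range(0, len(queries), batch_size)]
```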
“…To date, this is the first work to propose a strategy for improving memory access locality for parallel implementation of bagging ensembles on multi-core systems. Our work employs both measurement techniques and theoretical foundations proposed in [48] to demonstrate the benefits of mini-batching for the implementation of ensembles. The present work is different from previous work as we focus on a class of ensemble algorithms composed of bagging ensembles executing in the context of data streams.…”

Mini-batching approaches summarized in the citing work:

Reference | Goal | Application
[6] | Increase the utilization of resources | Web search engines
Carbone et al [7] | Fault tolerance and performance | Apache Flink
Gaioso et al [13] | Increase the utilization of resources | Web search engines
He et al [19] | Reduce recomputing and I/O redundancies | Large-scale data streams
Kukreja et al [26] | Reduce data movement | Large-scale FWI
Wang et al [42] | Energy optimization | Real-time tasks on heterogeneous sensors
Wen et al [45] | Reduce data on weight matrix | ANNs for image classification
Zaharia et al [49] | Fault tolerance and performance | Apache Spark
Zhang et al [50] | Reduce delay and energy consumption | DNNs on the edge

Section: How Our Work Is Different From Others
“…Web search engines rely on computationally expensive operations, like the WAND ranking algorithm, to process thousands of queries per second. Gaioso et al 5 proposed parallel strategies for single queries and batches of queries executed with the WAND ranking algorithm on GPUs. The authors presented and evaluated two strategies to partition the documents, named size-based and range-based.…”
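One way to read the size-based strategy mentioned above, evenly partitioning posting lists among thread blocks, is as a least-loaded greedy assignment. This is a sketch under our own assumptions (`size_based_partition` is a hypothetical name; the authors' actual GPU partitioning may differ):

```python
import heapq

def size_based_partition(posting_sizes, num_blocks):
    """Greedy least-loaded sketch of size-based partitioning (hypothetical,
    not the authors' GPU code): assign each posting list, largest first, to
    the thread block with the smallest current total size, so the work per
    block stays balanced."""
    # Min-heap of (current_load, block_id).
    loads = [(0, b) for b in range(num_blocks)]
    heapq.heapify(loads)
    assignment = {}
    for term, size in sorted(posting_sizes.items(), key=lambda kv: -kv[1]):
        load, block = heapq.heappop(loads)
        assignment[term] = block
        heapq.heappush(loads, (load + size, block))
    return assignment
```

A range-based alternative would instead split the document-ID range and assign each block the posting-list segments falling in its range; the trade-off the citing statement alludes to is load balance versus locality.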