Daniel Cederman scite author profile

Daniel Cederman

5Publications

194Citation Statements Received

58Citation Statements Given

How they've been cited

171

192

How they cite others

Affiliations

Chalmers University of Technology, TU Wien

Publications

Order By: Most citations

A Practical Quicksort Algorithm for Graphics Processors

Cederman

Tsigas

2008

121

View full text Add to dashboard Cite

Abstract. In this paper we present GPU-Quicksort, an efficient Quicksort algorithm suitable for highly parallel multi-core graphics processors. Quicksort has previously been considered as an inefficient sorting solution for graphics processors, but we show that GPU-Quicksort often performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.

show abstract

On sorting and load balancing on GPUs

Cederman

Tsigas

2008

SIGARCH Comput. Archit. News

View full text Add to dashboard Cite

In this paper we take a look at GPU-Quicksort, an efficient Quicksort algorithm suitable for the highly parallel multi-core graphics processors. Quicksort had previously been considered an inefficient sorting solution for graphics processors, but GPU-Quicksort often performs better than the fastest known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors. We also take look at a comparison of different load balancing schemes. To get maximum performance on the many-core graphics processors it is important to have an even balance of the workload so that all processing units contribute equally to the task at hand. This can be hard to achieve when the cost of a task is not known beforehand and when new sub-tasks are created dynamically during execution. With the recent advent of scatter operations and atomic hardware primitives it is now possible to bring some of the more elaborate dynamic load balancing schemes from the conventional SMP systems domain to the graphics processor domain.

show abstract

Data structures for task-based priority scheduling

Wimmer

Versaci

Träff

et al. 2014

View full text Add to dashboard Cite

Many task-parallel applications can benefit from attempting to execute tasks in a specific order, as for instance indicated by priorities associated with the tasks. We present three lock-free data structures for priority scheduling with different trade-offs on scalability and ordering guarantees. First we propose a basic extension to work-stealing that provides good scalability, but cannot provide any guarantees for task-ordering in-between threads. Next, we present a centralized priority data structure based on k-fifo queues, which provides strong (but still relaxed with regard to a sequential specification) guarantees. The parameter k allows to dynamically configure the trade-off between scalability and the required ordering guarantee. Third, and finally, we combine both data structures into a hybrid, k-priority data structure, which provides scalability similar to the work-stealing based approach for larger k, while giving strong ordering guarantees for smaller k. We argue for using the hybrid data structure as the best compromise for generic, priority-based task-scheduling.We analyze the behavior and trade-offs of our data structures in the context of a simple parallelization of Dijkstra's single-source shortest path algorithm. Our theoretical analysis and simulations show that both the centralized and the hybrid k-priority based data structures can give strong guarantees on the useful work performed by the parallel Dijkstra algorithm. We support our results with experimental evidence on an 80-core Intel Xeon system.

show abstract

Data structures for task-based priority scheduling

et al. 2014

View full text Add to dashboard Cite

We present three lock-free data structures for priority task scheduling: a priority work-stealing one, a centralized one with ρ-relaxed semantics, and a hybrid one combining both concepts. With the single-source shortest path (SSSP) problem as example, we show how the different approaches affect the prioritization and provide upper bounds on the number of examined nodes. We argue that priority task scheduling allows for an intuitive and easy way to parallelize the SSSP problem, notoriously a hard task. Experimental evidence supports the good scalability of the resulting algorithm.The larger aim of this work is to understand the trade-offs between scalability and priority guarantees in task scheduling systems. We show that ρ-relaxation is a valuable technique for improving the first, while still allowing semantic constraints to be satisfied: the lock-free, hybrid k-priority data structure can scale as well as work-stealing, while still providing strong priority scheduling guarantees, which depend on the parameter k. Our theoretical results open up possibilities for even more scalable data structures by adopting a weaker form of ρ-relaxation, which still enables the semantic constraints to be respected.

show abstract

Dynamic Load Balancing Using Work-Stealing

Cederman¹,

Tsigas²

2012

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Daniel Cederman

A Practical Quicksort Algorithm for Graphics Processors

On sorting and load balancing on GPUs

Data structures for task-based priority scheduling

Data structures for task-based priority scheduling

Dynamic Load Balancing Using Work-Stealing

Contact Info

Product

Resources

About