PTask

Rossbach, Christopher J.; Currey, Jon; Silberstein, Mark; Ray, Baishakhi; Witchel, Emmett

doi:10.1145/2043556.2043579

Cited by 198 publications

(12 citation statements)

References 47 publications


“…There are multiple attempts to reduce the complexity of the GPGPU programming model through software [13,32]. While these frameworks simplify the code for straightforward applications, like the UVA implementation of the vectorcopy example presented, it is still difficult to represent complex data structures.…”

Section: Related Work (mentioning)

confidence: 99%

Supporting x86-64 address translation for 100s of GPU lanes

Power, Hill, Wood. 2014. 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).

112


“…If a GPU kernel has been running for a long time, the Gdev scheduler assigns long slices of time to other GPU app kernels to achieve fair GPU utilization. PTask [28], where a GPGPU app is designed as a data flow graph that consists of GPU kernel modules, schedules GPU kernels when they are launched. These kernel-based schedulers suffer from the same problem as the command-based ones.…”

Section: Related Work (mentioning)

confidence: 99%

“…Scientific apps [5], [6] exclusively use GPUs to compute their simulations. Existing GPU resource managers, including GPU command-based schedulers [24]-[26], novel GPU kernel launchers [27], [28], and thread block schedulers [29], [30], fail to schedule GPU eaters appropriately since GPU eaters do not provide scheduling points such as kernel launches or thread block completion; thus, a hosted GPU eater may monopolize the GPU. Other techniques, such as context funneling [31], [32] and persistent threads [33], effectively schedule GPU eaters but fail to isolate GPGPU apps; thus, a hosted GPGPU app may access and modify the memory of other GPGPU apps, which is not suitable for multi-tenant cloud platforms.…”

Section: Introduction (mentioning)

confidence: 99%

Cooperative GPGPU Scheduling for Consolidating Server Workloads

Suzuki, Yamada, Kato, et al. 2018. IEICE Trans. Inf. & Syst.


Graphics processing units (GPUs) have become an attractive platform for general-purpose computing (GPGPU) in various domains. Making GPUs a time-multiplexing resource is a key to consolidating GPGPU applications (apps) in multi-tenant cloud platforms. However, advanced GPGPU apps pose a new challenge for consolidation. Such highly functional GPGPU apps, referred to as GPU eaters, can easily monopolize a shared GPU and starve collocated GPGPU apps. This paper presents GLoop, which is a software runtime that enables us to consolidate GPGPU apps including GPU eaters. GLoop offers an event-driven programming model, which allows GLoop-based apps to inherit the GPU eaters' high functionality while proportionally scheduling them on a shared GPU in an isolated manner. We implemented a prototype of GLoop and ported eight GPU eaters on it. The experimental results demonstrate that our prototype successfully schedules the consolidated GPGPU apps on the basis of its scheduling policy and isolates resources among them.


“…However, the problem is different to using heterogeneous cores, whether GPUs [7,14] or others [18]. NICs mostly provide fixed hardware functions rather than programmable cores, and different NIC models, even within a vendor, offer very different features and configuration options.…”

Section: Introduction (mentioning)

confidence: 99%

“…The dataflow model of computation also has a long history, and has recently been applied in parallel programming [1,6,14]. Dataflow representations of network processing are used in Click [8] and the x-kernel [3].…”

Section: Introduction (mentioning)

confidence: 99%

Modeling NICs with Unicorn

Shinde, Kaufmann, Kourtis, et al. 2013. Proceedings of the Seventh Workshop on Programming Languages and Operating Systems.


NICs are increasingly complex and diverse, offering a wide range of hardware functionality to aid network protocol processing. Harnessing the power of NIC hardware requires the ability to control and reason about a variety of different feature sets in the network stack. Towards this goal, we propose Unicorn, a language for describing modern NICs. Unicorn offers a simple set of abstractions for modeling both NIC functionality and the state of a protocol stack. To evaluate its expressivity and potential, we present a nontrivial model for the Intel i82599 10GbE NIC, and an algorithm that uses graph embedding to optimize the use of NIC hardware in the network stack.

