“…Scientific apps [5], [6] exclusively use GPUs to compute their simulations. Existing GPU resource managers, including GPU command-based schedulers [24]-[26], novel GPU kernel launchers [27], [28], and thread block schedulers [29], [30], fail to schedule GPU eaters appropriately since GPU eaters do not provide scheduling points such as kernel launches or thread block completion; thus, a hosted GPU eater may monopolize the GPU. Other techniques, such as context funneling [31], [32] and persistent threads [33], effectively schedule GPU eaters but fail to isolate GPGPU apps; thus, a hosted GPGPU app may access and modify the memory of other GPGPU apps, which is not suitable for multi-tenant cloud platforms.…”