Proceedings of the 5th High-Performance Graphics Conference 2013
DOI: 10.1145/2492045.2492060

Megakernels considered harmful

Abstract: When programming for GPUs, simply porting a large CPU program into an equally large GPU kernel is generally not a good approach. Due to the SIMT execution model on GPUs, divergence in control flow carries substantial performance penalties, as does high register usage, which lessens the latency-hiding capability that is essential for the high-latency, high-bandwidth memory system of a GPU. In this paper, we implement a path tracer on a GPU using a wavefront formulation, avoiding these pitfalls that can be especially …
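The divergence penalty described in the abstract can be illustrated with a toy cost model. The sketch below is not the paper's implementation: it assumes a SIMT width of 32 and assumes that a warp pays one serialized pass per distinct material among its threads. The names `warp_passes`, `build_queues`, and `wavefront_passes` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

WARP_SIZE = 32  # assumed SIMT width, for illustration only


def warp_passes(material_ids, warp_size=WARP_SIZE):
    """Toy megakernel cost: each warp runs once per distinct material it contains."""
    passes = 0
    for start in range(0, len(material_ids), warp_size):
        warp = material_ids[start:start + warp_size]
        passes += len(set(warp))  # divergent branches are serialized
    return passes


def build_queues(material_ids):
    """Group path indices into one queue per material (wavefront-style)."""
    queues = defaultdict(list)
    for path_index, material in enumerate(material_ids):
        queues[material].append(path_index)
    return dict(queues)


def wavefront_passes(material_ids, warp_size=WARP_SIZE):
    """Toy wavefront cost: each queue is processed by uniform, non-divergent warps."""
    queues = build_queues(material_ids)
    # Ceiling division: number of warps needed to drain each queue.
    return sum(-(-len(queue) // warp_size) for queue in queues.values())


if __name__ == "__main__":
    # 128 paths whose materials cycle through four values.
    ids = [i % 4 for i in range(128)]
    print(warp_passes(ids), wavefront_passes(ids))
```

Under this model, 128 paths cycling through four materials cost 16 passes when each warp takes consecutive paths, and 4 passes when paths are first grouped into per-material queues. Real hardware costs also depend on register pressure and memory traffic, which this sketch ignores.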

Cited by 74 publications (15 citation statements)
References: 20 publications
“…Rendering We use our own path tracer (PT) rendering pipeline as baseline engine. This PT uses an implementation inspired by Wavefront [VA11, LKA13], powered by Embree [WWB*14]. We distribute render work in tiles of (25,25) pixels that perform path tracing in usual Wavefront steps (ray generation, intersection, shading, connection), each tile using a separate CPU thread.…”
Section: Implementation Details; citation type: mentioning
confidence: 99%
“…The uncoupled, highly parallel and rather simple nature of the optical physics that is sufficient to describe neutrino detectors makes optical photon propagation well suited to general purpose GPU computing techniques where high performance requires massive parallelism with minimal communication between threads and low register usage [11].…”
Section: Introduction; citation type: mentioning
confidence: 99%
“…-Summarises the underlying split-kernel architecture [11] that is state-of-the-art for performant GPU path tracers.…”
Section: Thesis Aims and Contributions; citation type: mentioning
confidence: 99%
“…Since GPUs perform best on coherent workloads, it is often beneficial to sort the data before a GPU kernel operates on it. For example, in Megakernels Considered Harmful [11] Laine et. al.…”
Section: GPU Radix Sort; citation type: mentioning
confidence: 99%