2009 IEEE International Symposium on Performance Analysis of Systems and Software 2009
DOI: 10.1109/ispass.2009.4919648

Analyzing CUDA workloads using a detailed GPU simulator

Abstract: Modern Graphic Processing Units (GPUs)

Cited by 1,334 publications (855 citation statements)
References 23 publications
“…GPU simulators running CUDA's intermediate language PTX such as GPUSim [4] or Ocelot [5] can offer a greater accuracy, but still run an unoptimized intermediate code instead of the instructions actually executed by a GPU.…”
Section: Barra, a Functional Simulator of NVIDIA GPUs
Citation type: mentioning
confidence: 99%
“…To model a GPU-based address space with an allocated set of MCs, we employ GPGPUsim [3] simulator, which already contains a module that implement multiple DDR-based MC-system. Memory transactions are generated by the multiple GPU caches and treated in the memory module of GPGPUsim.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…For example, the GPU core configuration used in our evaluation employs a 24-stage pipeline with SIMD width of 8 (This is in line with a contemporary NVIDIA GTX280 architecture [11]. Similar pipeline configurations are also widely used in research GPU models [12,13].). Hence, assuming 4 extra stages for tolerating shared memory bank conflicts, the pipeline depth is increased from 24 stages to 28.…”
Section: Latency and Bandwidth Implications
Citation type: mentioning
confidence: 99%
“…Experimental Setup: We use a modified version of GPGPU-Sim [12], which is a cycle accurate full system simulator for GPUs implementing ptx ISA [19]. We model GPU cores with a 24-stage pipeline similar to contemporary implementations [11,13].…”
Section: Experimental Evaluation
Citation type: mentioning
confidence: 99%