SC14: International Conference for High Performance Computing, Networking, Storage and Analysis 2014
DOI: 10.1109/sc.2014.73

Scaling the Power Wall: A Path to Exascale

Abstract: Modern scientific discovery is driven by an insatiable demand for computing performance. The HPC community is targeting development of supercomputers able to sustain 1 ExaFlops by the year 2020 and power consumption is the primary obstacle to achieving this goal. A combination of architectural improvements, circuit design, and manufacturing technologies must provide over a 20× improvement in energy efficiency. In this paper, we present some of the progress NVIDIA Research is making toward the design o…

Cited by 100 publications (38 citation statements)
References 28 publications
“…A direct port of the SNAP mini-app using the CUDA framework by P. Wang et al exists and uses a similar parallelisation scheme to the original code [23]. In our previous work we showed that on a single node this approach did not improve the performance over our benchmark CPU despite the use of GPUs [9].…”
Section: Related Work (mentioning)
confidence: 92%
“…We have used CACTI for estimating the power consumption of caches; however since our accelerator is synthesized for 22nm, we have scaled down the area and power values generated by CACTI from 32nm to 22nm. For scaling area, coefficient of 0.5 is used [9,10], whereas for scaling power, coefficients of 0.569 [16] (dynamic) and 0.8 [32] (leakage) are used. For DRAM power consumption of both template-based and the HLS accelerators, we have integrated the DRAMSim2 tool into our simulators and used the aforementioned DDR4 model for power and timing estimations.…”
Section: Power Performance and Area Results (mentioning)
confidence: 99%
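The technology scaling described in the statement above can be illustrated with a minimal sketch. It assumes the quoted coefficients (0.5 for area, 0.569 for dynamic power, 0.8 for leakage power) are applied as simple multipliers to 32nm values. The function name and the input figures are hypothetical and do not come from the citing paper.

```python
# Scaling coefficients quoted in the citation statement (32nm -> 22nm).
AREA_SCALE = 0.5
DYNAMIC_POWER_SCALE = 0.569
LEAKAGE_POWER_SCALE = 0.8


def scale_32nm_to_22nm(area_mm2, dynamic_mw, leakage_mw):
    """Return (area, dynamic power, leakage power) scaled from 32nm to 22nm.

    Assumption: each coefficient is a plain multiplier on the 32nm value.
    """
    return (
        area_mm2 * AREA_SCALE,
        dynamic_mw * DYNAMIC_POWER_SCALE,
        leakage_mw * LEAKAGE_POWER_SCALE,
    )


# Hypothetical 32nm cache figures, chosen only for illustration.
area, dynamic, leakage = scale_32nm_to_22nm(2.0, 100.0, 10.0)
print(f"area={area:.3f} mm^2, dynamic={dynamic:.3f} mW, leakage={leakage:.3f} mW")
```

Under these assumptions, leakage shrinks less than dynamic power, so its share of total power grows at the smaller node.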
“…Modern Central Processing Unit (CPU) performance and speed have begun to plateau over recent years due to "the power wall," thus prompting more research into multi-core and many-core systems [30]. General Purpose Graphics Processing Unit (GPGPU) programming is a programming paradigm which employs Graphics Processing Units (GPUs) to run code typically executed on the CPU in order to provide performance gain by way of parallelism and data throughput.…”
Section: The Problem Statement (mentioning)
confidence: 99%