2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum, 2013
DOI: 10.1109/ipdpsw.2013.166

Algorithm/Architecture Codesign of Low Power and High Performance Linear Algebra Compute Fabrics

Citations: cited by 4 publications (2 citation statements)
References: 107 publications (154 reference statements)
“…The workload consists of N fine-grained concurrent execution segments, requiring T_Ref = W L * 1 cycles to execute on a baseline reference floating-point engine capable of performing 1 FLOP/cycle (single-precision floating-point operation per cycle). Such a floating-point engine consumes an area of 0.01 mm² in 45nm [Pedram 2013] and dissipates 10 mW [Keckler et al 2011]. With 22nm process technology, the same floating-point engine would consume an area of 0.003 mm² and dissipate roughly 5 mW [Cassidy and Andreou 2012; Keckler et al 2011].…”
Section: Analytic Model and Comparative Analysis (mentioning)
confidence: 99%
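
The per-engine figures quoted in this statement lend themselves to a quick back-of-the-envelope calculation. The following is a minimal sketch, not the cited authors' model: it assumes the workload term in the statement (written "W L" there) is a count of floating-point operations, and it assumes area and power scale linearly with the number of engines, ignoring memory and interconnect. The names `FpuNode`, `reference_cycles`, and `array_budget` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FpuNode:
    """Per-engine figures as quoted in the citation statement."""
    name: str
    area_mm2: float  # area of one 1 FLOP/cycle engine, in mm^2
    power_mw: float  # power dissipated by one engine, in mW


# Values taken from the quoted statement.
NODE_45NM = FpuNode("45nm", area_mm2=0.01, power_mw=10.0)
NODE_22NM = FpuNode("22nm", area_mm2=0.003, power_mw=5.0)


def reference_cycles(workload_flops: int, flops_per_cycle: int = 1) -> int:
    """Cycles needed on the baseline engine (assumption: workload is in FLOPs)."""
    # Ceiling division, so a partial final cycle still counts as a full cycle.
    return -(-workload_flops // flops_per_cycle)


def array_budget(node: FpuNode, n_engines: int) -> tuple[float, float]:
    """Total (area in mm^2, power in mW), assuming simple linear scaling."""
    return node.area_mm2 * n_engines, node.power_mw * n_engines


if __name__ == "__main__":
    print(reference_cycles(1_000_000))
    for node in (NODE_45NM, NODE_22NM):
        area, power = array_budget(node, n_engines=1000)
        print(f"{node.name}: {area:.3f} mm^2, {power:.1f} mW")
```

Linear scaling is the simplest possible assumption here; a real fabric would add area and power for local storage and communication, which the quoted figures do not cover.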
“…Using the silicon area figures for CAM [10], RAM and floating point unit [1], and assuming the number of acceleration modules = 15 and CAM/RAM array height h = 2, we estimate the area of the CMOS SpMSpV accelerator at 90 in 22nm technology node. As CMOS feature scaling slows down, conventional memory technology experiences scalability problems.…”
Section: Resistive Implementation (mentioning)
confidence: 99%