2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)
DOI: 10.1109/micro.2016.7783744

Concise loads and stores: The case for an asymmetric compute-memory architecture for approximation

Cited by 24 publications (26 citation statements)
References 30 publications

“…Reducing the precision of floating-point [6,21,42] and fixed-point [22] numbers has been used to alleviate the memory bandwidth bottleneck in deep neural networks [22], GPU workloads [42], and other approximation-tolerant applications [21], thereby improving performance and energy efficiency. However, the compression ratio remains limited to between 2:1 and 4:1 despite the loss of precision, as these approaches do not exploit inter-value similarities to compress data.…”
Section: Related Work (mentioning)
confidence: 99%
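To make the precision-reduction idea concrete, the following minimal C sketch (illustrative only, not the mechanism of any cited paper) keeps just the upper 16 bits of each 32-bit float, namely the sign, exponent, and top 7 mantissa bits, giving a fixed 2:1 lossy compression of the kind the citation describes.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Keep only the upper 16 bits of an IEEE-754 single-precision value:
 * 1 sign bit, 8 exponent bits, and the top 7 mantissa bits.
 * This is a fixed 2:1 lossy compression (bfloat16-style truncation). */
static uint16_t truncate_fp32(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* type-pun without aliasing issues */
    return (uint16_t)(bits >> 16);
}

/* Expand back; the discarded mantissa bits are zero-filled, so the
 * result is only an approximation of the original value. */
static float expand_fp32(uint16_t half) {
    uint32_t bits = (uint32_t)half << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void) {
    float original = 3.14159265f;
    uint16_t concise = truncate_fp32(original);   /* 2 bytes instead of 4 */
    printf("original %.7f, approximate %.7f\n", original, expand_fp32(concise));
    return 0;
}

Because every value shrinks by the same fixed factor regardless of its content, the ratio is capped at 2:1 (or 4:1 with more aggressive truncation), which is exactly the limitation the citation points out.
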
“…Similar to most techniques that focus on data approximation [21,36,39], AVR assumes that the programmer annotates memory regions that can be approximated and hence compressed in a lossy manner. This annotation also includes the size of the region as well as the datatype of the approximable data.…”
Section: Memory Blocks (mentioning)
confidence: 99%
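As a rough sketch of what such a region annotation could look like at the source level, the C fragment below records a region's base address, size, and element datatype in a small table; the function name approx_annotate_region, the enum, and the table are invented for illustration and are not AVR's actual interface.

#include <stddef.h>
#include <stdio.h>

/* Illustrative element types for approximable data (hypothetical, not AVR's API). */
enum approx_dtype { APPROX_FP32, APPROX_FP64, APPROX_FIXED16 };

struct approx_region {
    void *base;                 /* start of the approximable region   */
    size_t size;                /* size of the region in bytes        */
    enum approx_dtype dtype;    /* datatype of the values it contains */
};

#define MAX_REGIONS 16
static struct approx_region regions[MAX_REGIONS];
static int num_regions;

/* Hypothetical annotation call: declare that [base, base + size) holds
 * approximable values of the given type, so loads and stores to it may
 * be served from a lossily compressed representation. */
static void approx_annotate_region(void *base, size_t size, enum approx_dtype dtype) {
    if (num_regions < MAX_REGIONS)
        regions[num_regions++] = (struct approx_region){ base, size, dtype };
}

int main(void) {
    static float weights[1024];  /* e.g., an error-tolerant weight buffer */
    approx_annotate_region(weights, sizeof weights, APPROX_FP32);
    printf("annotated %d approximable region(s)\n", num_regions);
    return 0;
}
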
“…Register-width annotations can be used to enable optimizations to, for instance, functional units (e.g., SIMD-style parallelism [14]), cache systems [6], bandwidth utilization [21], and register file organizations [12]. The register file is of particular interest in GPU architectures.…”
Section: Motivation (mentioning)
confidence: 99%
“…The work by Jain et al. [6] also investigates the mantissa-truncation format, but in the context of optimizations in the CPU memory hierarchy. Since they target the memory hierarchy, their approach is orthogonal to ours.…”
Section: Related Work (mentioning)
confidence: 99%