2020
DOI: 10.48550/arXiv.2007.07298
Preprint

Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning

Abstract: As modern neural networks have grown to billions of parameters, meeting tight latency budgets has become increasingly challenging. Approaches like compression, sparsification and network pruning have proven effective in tackling this problem, but they rely on modifications of the underlying network. In this paper, we look at a complementary approach of optimizing how tensors are mapped to on-chip memory in an inference accelerator while leaving the network parameters untouched. Since different memory components tr…
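
The truncated abstract describes mapping each tensor to a memory component with different capacity/bandwidth trade-offs, optimized by an evolutionary graph RL method. As a rough illustration only, and not the paper's actual algorithm, a minimal gradient-free evolutionary loop over tensor-to-memory placements might look like the sketch below; the memory levels, capacities, and latency model are invented assumptions:

```python
# Hypothetical sketch of the idea in the abstract: evolve a placement that
# maps each tensor in a workload to one of several on-chip memory levels,
# scored by a toy latency model. All names and the cost model are
# illustrative assumptions, not the paper's actual method.
import random

MEMORY_LEVELS = ["SRAM", "LLC", "DRAM"]          # fastest to slowest
LATENCY_PER_BYTE = {"SRAM": 1.0, "LLC": 4.0, "DRAM": 16.0}
CAPACITY = {"SRAM": 2_000, "LLC": 8_000, "DRAM": float("inf")}

# Toy workload: tensor name -> size in bytes.
tensors = {f"t{i}": random.randint(100, 2_000) for i in range(12)}

def latency(placement):
    """Sum per-tensor access cost; penalize placements that overflow a level."""
    used = {m: 0 for m in MEMORY_LEVELS}
    cost = 0.0
    for name, level in placement.items():
        used[level] += tensors[name]
        cost += tensors[name] * LATENCY_PER_BYTE[level]
    overflow = sum(max(0, used[m] - CAPACITY[m]) for m in MEMORY_LEVELS)
    return cost + 100.0 * overflow  # heavy penalty for infeasible placements

def mutate(placement, rate=0.2):
    child = dict(placement)
    for name in child:
        if random.random() < rate:
            child[name] = random.choice(MEMORY_LEVELS)
    return child

# Simple elitist (mu + lambda) evolutionary loop over placements.
population = [{n: random.choice(MEMORY_LEVELS) for n in tensors} for _ in range(20)]
for gen in range(50):
    population.sort(key=latency)
    parents = population[:5]
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

best = min(population, key=latency)
print(f"best latency: {latency(best):.0f}")
```

The paper pairs this kind of gradient-free search with a learned graph policy over the workload; the loop above only conveys the placement-as-optimization framing.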

Cited by 1 publication (1 citation statement)
References 35 publications

“…Our work, by contrast, reports a Pareto frontier in three relevant dimensions including accuracy, memory compression and bit-operations as a surrogate metric for hardware performance, and can be easily extended to more objectives depending on the broader task or specific user needs. We also develop a graph representation of the workload inspired by Khadka et al. [2020], allowing us to leverage graph convolutions trained by gradient-free neuroevolution, which to the best of our knowledge is a novel approach to mixed-precision quantization.…”
Section: Related Work
confidence: 99%
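
The citation statement sketches the citing work's approach: per-layer bit-width choices evolved without gradients while tracking a Pareto front over objectives such as accuracy and memory. The toy sketch below illustrates only that evolutionary Pareto loop; the bit-width options, parameter counts, and error proxy are invented assumptions, and the graph convolutions mentioned in the quote are not modeled here:

```python
# Hypothetical illustration of the citing work's stated approach: evolve
# per-layer bit-widths with a gradient-free method and keep the Pareto set
# over (accuracy proxy, memory). The fitness model is a stand-in assumption,
# not the authors' actual evaluation pipeline.
import random

BITS = [2, 4, 8]
NUM_LAYERS = 8
PARAMS = [random.randint(10_000, 200_000) for _ in range(NUM_LAYERS)]

def evaluate(bit_widths):
    """Toy objectives: lower bits shrink memory but add quantization error."""
    memory = sum(p * b / 8 for p, b in zip(PARAMS, bit_widths))
    error = sum(1.0 / (2 ** b) for b in bit_widths)  # crude accuracy proxy
    return error, memory

def dominates(a, b):
    """a Pareto-dominates b if it is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front(population):
    scored = [(ind, evaluate(ind)) for ind in population]
    return [ind for ind, f in scored
            if not any(dominates(g, f) for _, g in scored)]

def mutate(ind, rate=0.25):
    return [random.choice(BITS) if random.random() < rate else b for b in ind]

# Keep the non-dominated set and refill the population with mutated copies.
population = [[random.choice(BITS) for _ in range(NUM_LAYERS)] for _ in range(30)]
for gen in range(40):
    front = pareto_front(population)
    population = front + [mutate(random.choice(front))
                          for _ in range(30 - len(front))]

for ind in pareto_front(population):
    err, mem = evaluate(ind)
    print(ind, f"error≈{err:.3f}", f"memory≈{mem/1024:.0f} KiB")
```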