2020
DOI: 10.48550/arXiv.2007.07298
Preprint

Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning

Abstract: As modern neural networks have grown to billions of parameters, meeting tight latency budgets has become increasingly challenging. Approaches like compression, sparsification and network pruning have proven effective in tackling this problem, but they rely on modifications of the underlying network. In this paper, we look at a complementary approach of optimizing how tensors are mapped to on-chip memory in an inference accelerator while leaving the network parameters untouched. Since different memory components tr…
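
The truncated abstract describes mapping each tensor to a memory component with different capacity/bandwidth trade-offs, optimized by an evolutionary graph RL method. As a rough illustration only, and not the paper's actual algorithm, a minimal gradient-free evolutionary loop over tensor-to-memory placements might look like the sketch below; the memory levels, capacities, and latency model are invented assumptions:

```python
# Hypothetical sketch of the idea in the abstract: evolve a placement that
# maps each tensor in a workload to one of several on-chip memory levels,
# scored by a toy latency model. All names and the cost model are
# illustrative assumptions, not the paper's actual method.
import random

MEMORY_LEVELS = ["SRAM", "LLC", "DRAM"]          # fastest to slowest
LATENCY_PER_BYTE = {"SRAM": 1.0, "LLC": 4.0, "DRAM": 16.0}
CAPACITY = {"SRAM": 2_000, "LLC": 8_000, "DRAM": float("inf")}

# Toy workload: tensor name -> size in bytes.
tensors = {f"t{i}": random.randint(100, 2_000) for i in range(12)}

def latency(placement):
    """Sum per-tensor access cost; penalize placements that overflow a level."""
    used = {m: 0 for m in MEMORY_LEVELS}
    cost = 0.0
    for name, level in placement.items():
        used[level] += tensors[name]
        cost += tensors[name] * LATENCY_PER_BYTE[level]
    overflow = sum(max(0, used[m] - CAPACITY[m]) for m in MEMORY_LEVELS)
    return cost + 100.0 * overflow  # heavy penalty for infeasible placements

def mutate(placement, rate=0.2):
    child = dict(placement)
    for name in child:
        if random.random() < rate:
            child[name] = random.choice(MEMORY_LEVELS)
    return child

# Simple elitist (mu + lambda) evolutionary loop over placements.
population = [{n: random.choice(MEMORY_LEVELS) for n in tensors} for _ in range(20)]
for gen in range(50):
    population.sort(key=latency)
    parents = population[:5]
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

best = min(population, key=latency)
print(f"best latency: {latency(best):.0f}")
```

The paper pairs this kind of gradient-free search with a learned graph policy over the workload; the loop above only conveys the placement-as-optimization framing.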

Cited by 1 publication (1 citation statement)
References 35 publications

“…Our work, by contrast, reports a Pareto frontier in three relevant dimensions including accuracy, memory compression and bit-operations as a surrogate metric for hardware performance, and can be easily extended to more objectives depending on the broader task or specific user needs. We also develop a graph representation of the workload inspired by Khadka et al. [2020], allowing us to leverage graph convolutions trained by gradient-free neuroevolution, which to the best of our knowledge is a novel approach to mixed-precision quantization.…”
Section: Related Work
confidence: 99%
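
The citation statement sketches the citing work's approach: per-layer bit-width choices evolved without gradients while tracking a Pareto front over objectives such as accuracy and memory. The toy sketch below illustrates only that evolutionary Pareto loop; the bit-width options, parameter counts, and error proxy are invented assumptions, and the graph convolutions mentioned in the quote are not modeled here:

```python
# Hypothetical illustration of the citing work's stated approach: evolve
# per-layer bit-widths with a gradient-free method and keep the Pareto set
# over (accuracy proxy, memory). The fitness model is a stand-in assumption,
# not the authors' actual evaluation pipeline.
import random

BITS = [2, 4, 8]
NUM_LAYERS = 8
PARAMS = [random.randint(10_000, 200_000) for _ in range(NUM_LAYERS)]

def evaluate(bit_widths):
    """Toy objectives: lower bits shrink memory but add quantization error."""
    memory = sum(p * b / 8 for p, b in zip(PARAMS, bit_widths))
    error = sum(1.0 / (2 ** b) for b in bit_widths)  # crude accuracy proxy
    return error, memory

def dominates(a, b):
    """a Pareto-dominates b if it is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front(population):
    scored = [(ind, evaluate(ind)) for ind in population]
    return [ind for ind, f in scored
            if not any(dominates(g, f) for _, g in scored)]

def mutate(ind, rate=0.25):
    return [random.choice(BITS) if random.random() < rate else b for b in ind]

# Keep the non-dominated set and refill the population with mutated copies.
population = [[random.choice(BITS) for _ in range(NUM_LAYERS)] for _ in range(30)]
for gen in range(40):
    front = pareto_front(population)
    population = front + [mutate(random.choice(front))
                          for _ in range(30 - len(front))]

for ind in pareto_front(population):
    err, mem = evaluate(ind)
    print(ind, f"error≈{err:.3f}", f"memory≈{mem/1024:.0f} KiB")
```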