“…Deep neural networks (DNNs) have been widely used to solve complex problems across a wide range of domains, including computer vision, speech processing, and robotics [1][2][3][4]. While DNNs achieve remarkable results on high-performance cloud servers, they are also expected to perform efficiently when deployed locally on mobile/embedded devices, due to connectivity and latency limitations as well as privacy and security concerns [5,6]. Since mobile devices operate under tight latency, throughput, and energy constraints, many specialized DNN-inference accelerators have been proposed, achieving compelling results compared to traditional CPUs and GPUs [7].…”