Recently, the posit numerical format has shown promise for DNN data representation and compute at ultra-low precision ([5..8]-bit). However, the majority of studies focus only on DNN inference. In this work, we propose DNN training using posits and compare it against floating-point training. We evaluate on the MNIST and Fashion MNIST datasets, where 16-bit posits outperform 16-bit floating point for end-to-end DNN training.

Index Terms: Deep neural networks, low-precision arithmetic, posit numerical format
I. INTRODUCTION

Edge computing offers a decentralized alternative to cloud-based data centers [1] and brings intelligence to the edge of mobile networks. However, training deep neural networks (DNNs) on the edge is challenging, largely due to the significant cost of multiply-and-accumulate (MAC) units, a ubiquitous operation in all DNNs. In a 45 nm CMOS process, energy consumption roughly doubles from 16-bit to 32-bit floating-point addition and grows by ∼4× for multiplication [2]. Memory access cost increases by ∼10× when moving from an 8 kB to a 1 MB memory for a 64-bit cache access [2]. In general, there is a gap between the memory storage, bandwidth, compute requirements, and energy consumption of modern DNNs and the hardware resources available on edge devices [3].

An apparent solution to this gap is to compress such networks, reducing their compute requirements to match putative edge resources. Several groups have proposed new compute- and memory-efficient DNN architectures [4]-[6] and parameter-efficient neural networks, using methods such as DNN pruning [7], distillation [8], and low-precision arithmetic [9], [10]. Among these approaches, low-precision arithmetic is notable for reducing the memory capacity, bandwidth, latency, and energy consumption associated with MAC units in DNNs while increasing the level of data parallelism [9], [11], [12]. The ultimate goal of low-precision DNN design is to reduce the hardware complexity of a high-precision DNN model to a level suitable for edge devices without significantly degrading performance.

To address the gaps in previous studies, we investigate low-precision posit arithmetic for DNN training on the edge.
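To make the MAC cost concrete, the following minimal Python/NumPy sketch (illustrative only, not the implementation used in this work or the cited references) shows a dot product as a chain of MAC operations and the memory saving from storing operands in 16-bit rather than 32-bit floats; keeping a wider float32 accumulator is one common way to limit rounding error when operands are low precision.

    import numpy as np

    # Illustrative sketch: a dot product is a chain of multiply-and-accumulate
    # (MAC) operations, the dominant compute pattern in DNN layers.
    def mac_dot(weights, activations):
        acc = np.float32(0.0)                      # wider accumulator than the operands
        for w, a in zip(weights, activations):
            acc += np.float32(w) * np.float32(a)   # one MAC per element
        return acc

    w = np.random.randn(1024).astype(np.float16)   # 16-bit storage: 2048 bytes
    a = np.random.randn(1024).astype(np.float16)   # vs. 4096 bytes at 32-bit
    print(mac_dot(w, a), w.nbytes, "bytes per operand vector")

Halving the operand width halves the storage and bandwidth per MAC, which is the source of the memory and energy savings discussed above; posits aim to deliver such savings with less accuracy loss than fixed-width floats.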
II. POSIT NUMERICAL FORMAT

An alternative to IEEE-754 floating point numbers, posits were recently introduced and exhibit a tapered-precision char-