Although research on the inference phase of edge artificial intelligence (AI) has made considerable improvement, the required training phase remains an unsolved problem. Neural network (NN) processing has two phases: inference and training. In the training phase, a NN incurs high calculation cost. The number of bits (bitwidth) in the training phase is several orders of magnitude larger than that in the inference phase. Training algorithms, optimized to software, are not appropriate for training hardware-oriented NNs. Therefore, we propose a new training algorithm for edge AI: backpropagation (BP) using a ternarized gradient. This ternarized backpropagation (TBP) provides a balance between calculation cost and performance. Empirical results demonstrate that in a two-class classification task, TBP works well in practice and compares favorably with 16-bit BP (Fixed-BP).
In the era of Internet of Things, the battery life of edge devices must be extended for sensing connection to the Internet. We aim to reduce the power consumption of the microprocessor embedded in such devices by using a novel dynamically reconfigurable accelerator. Conventional microprocessors consume a large amount of power for memory access, in registers, and for the control of the processor itself rather than computation; this decreases the energy efficiency. Dynamically reconfigurable accelerators reduce such redundant power by computing in parallel on reconfigurable switches and processing element arrays (often consisting of an arithmetic logic unit (ALU) and registers). We propose a novel dynamically reconfigurable accelerator "DYNaSTA" composed of a dynamically reconfigurable data path and static ALU arrays. The static ALU arrays process instructions in parallel without registers and improve energy efficiency. The dynamically reconfigurable data path includes registers and many switches dynamically reconfigured to resolve operand dependencies between instructions mapped on the static ALU array, and forwards appropriate operands to the static ALU array. Therefore, the DYNaSTA accelerator has more flexibility while improving the energy efficiency compared with the conventional dynamically reconfigurable accelerators. We simulated the power consumption of the proposed DYNaSTA accelerator and measured the fabricated chip. As a result, the power consumption was reduced by 69% to 86%, and the energy efficiency improved 4.5 to 13 times compared to a general RISC microprocessor.
Neural networks architectures will be soon required to be deployed on IoT-edge systems, which will demand small and low-power circuits and application-dependent customizability. Focusing on the power-efficiency and internal structures of 3D memristive devices, this paper proposes neural networks architectures that are implementable in those devices, together with an existing transfer learning technique. Our architectural model is stochastic and device-conscious in handling noise/variation-vulnerability and suppressing inter-layer connections. Through experiments, we quantitatively explored appropriate structures under device constraints and demonstrated comparable performance as conventional work holding complex connections. Also, we revealed an important challenge on device technologies for further performance improvement.
Demands for low-energy microcontrollers have been increasing in recent years. Since most microcontrollers achieve user programmability by integrating nonvolatile (NV) memories such as flash memories for storing their programs, the large power consumption required in accessing an NV memory has become a major problem. This problem becomes critical when the power supply voltage of NV microcontrollers is decreased. We can solve this problem by introducing an instruction cache, thus reducing the access frequency of the NV memory. Unlike general-purpose microprocessors, microcontrollers used for real-time applications in embedded systems must accurately calculate program execution time prior to its execution. Therefore, we introduce a "transparent" instruction cache, which does not change the existing NV microcontroller's cycle-level execution time, for reducing power and energy consumption, but not for improving the processing speed. We have conducted detailed microar chitecture design based on the architecture of a major industrial microcontroller, and we evaluated power and energy consumption for several benchmark programs. Our evaluation shows that the proposed instruction cache can successfully reduce energy consumption in a fairly wide range of practical NV microcontroller configurations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.