Training Hardware for Binarized Convolutional Neural Network Based on CMOS Invertible Logic

Shin, Duckgyu; Onizawa, Naoya; Gross, Warren J.; Hanyu, Takahiro

doi:10.1109/access.2020.3029576

Cited by 12 publications

(12 citation statements)

References 24 publications


“…In conventional works, dense random signals (r(t)) are used, such as uniform random signals of [-1:+1] [1], [5] or binary random signals of [-1,+1] [3], [4], [6] shown in Fig. 2 (b) and (c).…”

Section: Spin-state Update With Dense Random Signal (mentioning)

confidence: 99%

“…The bidirectional computing capability is realized by reducing the network energy to the global minimum energy with noise induced by random signals (e.g., a multiplier can be used as a factorizer in the backward mode). Due to the unique feature, several challenging problems can be quickly solved, such as integer factorization (e.g., cryptography problems [1]) and machine learning (e.g., training neural networks [3], [4]).…”

Section: Introduction (mentioning)

confidence: 99%


Sparse Random Signals for Fast Convergence on Invertible Logic

et al. 2021

Self Cite


This paper introduces sparse random signals for fast convergence on invertible logic. Invertible logic based on a network of probabilistic nodes (spins), similar to a Boltzmann machine, can compute functions bidirectionally by reducing the network energy to the global minimum with the addition of random signals. Here, we propose using sparse random signals that are generated by replacing a part of the typical dense random signals with zero values in probability. The sparsity of the random signals can induce a relatively relaxed transition of the spin network, reaching the global minimum energy at high probabilities. As a typical design example of invertible logic, invertible adders and multipliers are designed and evaluated. The simulation results show that the convergence speed with the proposed sparse random signals is roughly an order of magnitude faster than that with the conventional dense random signals. In addition, several key parameters are found and could be a guideline for fast convergence on general invertible logic.
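The sparsification described in the abstract can be illustrated with a minimal sketch. It assumes binary dense signals in {-1, +1} and a hypothetical parameter `p_zero` for the probability of replacing a sample with zero; neither the function names nor the value 0.9 come from the paper.

```python
import random

def dense_binary_noise(n, rng):
    """Dense binary random signal: every sample is -1 or +1."""
    return [rng.choice((-1, 1)) for _ in range(n)]

def sparse_binary_noise(n, p_zero, rng):
    """Sparse variant: each sample is replaced by 0 with probability p_zero.

    p_zero is a hypothetical name for the sparsity parameter.
    """
    return [0 if rng.random() < p_zero else rng.choice((-1, 1))
            for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is reproducible
signal = sparse_binary_noise(10_000, p_zero=0.9, rng=rng)
zero_fraction = signal.count(0) / len(signal)
print(f"fraction of zero samples: {zero_fraction:.3f}")
```

With most samples set to zero, a spin receives a noise kick only occasionally, which is one way to picture the "relatively relaxed transition" the abstract mentions.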


“…The spin-gate circuit has been presented [9] for a fully-parallel architecture with a fixed Hamiltonian. Each spin contains a different number of inputs from other spins, where the Hamiltonian coefficients are hardwired for a dedicated application [5]-[7]. The fully-parallel manner restricts the invertible-logic hardware to small-scale Hamiltonians due to area limitations on application specific integrated circuit (ASIC) or field-programmable gate array (FPGA).…”

Section: B. Hardware Implementation of CMOS Invertible Logic Using Stochastic Computing (mentioning)

confidence: 99%

“…By reducing the network energy using random signals, the bidirectional operation can be realized probabilistically. The unique feature of the bidirectional computing can be applied for solving several critical issues, such as integer factorization (e.g., invertible multiplier operates as factorization at a backward mode) and training neural networks [5]-[7].…”

Section: Introduction (mentioning)

confidence: 99%

Hardware Acceleration of Large-Scale CMOS Invertible Logic Based on Sparse Hamiltonian Matrices

Onizawa, Tamakoshi, Hanyu 2021

IEEE Open J. Circuits Syst.

Self Cite


Invertible logic has been recently presented that can realize bidirectional computing based on Hamiltonians for solving several critical issues, such as integer factorization and training neural networks. However, a hardware architecture for supporting large-scale general-purpose invertible logic has not been studied. In this paper, we introduce a scalable hardware architecture based on sparse Hamiltonian matrices. In order to store and compute the Hamiltonians efficiently in hardware, a sparse matrix representation called PTELL (partitioned and transposed ELLPACK) is proposed. The memory size of PTELL can be smaller than that of a conventional ELL because the amount of padding is reduced, while parallel reading of non-zero values is realized for high-throughput operations. As a result, PTELL uses around 1% and 10% of the memory of conventional dense and ELL matrices, respectively, in the case of invertible multipliers. In addition, the proposed hardware accelerator of invertible logic for supporting arbitrary Hamiltonians is implemented on a Xilinx VU9P FPGA and is around two orders of magnitude faster than a 16-core Intel Xeon implementation.
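For readers unfamiliar with the baseline that PTELL is compared against, the following sketch shows conventional ELLPACK (ELL) storage only. The partitioning and transposition steps of PTELL are not described in this listing and are not reproduced here; the matrix `H` and the helper names are illustrative assumptions.

```python
def to_ell(dense):
    """Convert a dense matrix (list of rows) to conventional ELL arrays.

    Every row is padded to the width of the row with the most non-zeros,
    which is the padding overhead the abstract refers to.
    """
    width = max(sum(1 for v in row if v != 0) for row in dense)
    values, cols = [], []
    for row in dense:
        nz = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nz)
        cols.append([j for j, _ in nz] + [0] * pad)    # padded column indices
        values.append([v for _, v in nz] + [0] * pad)  # padded with zeros
    return values, cols

def ell_matvec(values, cols, x):
    """Multiply an ELL matrix by a vector; zero padding contributes nothing."""
    return [sum(v * x[j] for v, j in zip(vrow, crow))
            for vrow, crow in zip(values, cols)]

# Small illustrative matrix (not taken from the paper).
H = [[0, 2, 0, 0],
     [1, 0, 3, 4],
     [0, 0, 0, 5]]
values, cols = to_ell(H)
y = ell_matvec(values, cols, [1, 1, 1, 1])
padding = sum(row.count(0) for row in values)
print(values, cols, y, padding)
```

In this toy case 4 of the 9 stored slots are padding, because one row has three non-zeros while the others have one; rows with uneven non-zero counts are where a padding-reducing layout would help.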


“…cryptography problems [2]) and machine learning (e.g. training neural networks [3], [4]). The Hamiltonian is constructed by a network of spins (probabilistic nodes) with interactions among them.…”

Section: Introduction (mentioning)

confidence: 99%
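The idea of a Hamiltonian built from interacting spins can be made concrete with a small sketch. It assumes a two-body energy of the form E = -Σ J_ij m_i m_j - Σ h_i m_i over spins m_i in {-1, +1}, and a coefficient set for a 3-spin AND gate that is an assumption for illustration, not taken from the cited papers. Exhaustive enumeration stands in for the stochastic, noise-driven search that real invertible-logic hardware performs.

```python
from itertools import product

# Assumed coefficients for a 3-spin AND gate, spin order (A, B, C), C = A AND B.
H_BIAS = [1, 1, -2]
J = [[0, -1, 2],
     [-1, 0, 2],
     [2, 2, 0]]

def energy(m):
    """Two-body Hamiltonian: E = -sum_{i<j} J_ij m_i m_j - sum_i h_i m_i."""
    pair = sum(J[i][j] * m[i] * m[j]
               for i in range(3) for j in range(i + 1, 3))
    bias = sum(h * s for h, s in zip(H_BIAS, m))
    return -(pair + bias)

states = list(product((-1, 1), repeat=3))
e_min = min(energy(m) for m in states)
ground_states = [m for m in states if energy(m) == e_min]

# Backward mode: clamp the output C = +1 and pick the lowest-energy inputs.
backward = min((m for m in states if m[2] == 1), key=energy)
print(ground_states, backward)
```

The four minimum-energy states are exactly the rows of the AND truth table (with -1 standing for logic 0), and clamping the output to +1 leaves A = B = +1 as the only low-energy choice, which is the small-scale analogue of running a multiplier backward as a factorizer.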

High Convergence Rates of CMOS Invertible Logic Circuits Based on Many-Body Hamiltonians

Onizawa, Hanyu 2021

2021 IEEE International Symposium on Circuits and Systems (ISCAS)

Self Cite


This paper introduces CMOS invertible-logic (CIL) circuits based on many-body Hamiltonians. CIL can realize probabilistic forward and backward operations of a function by annealing a corresponding Hamiltonian using stochastic computing. We have created a Hamiltonian that includes three-body interaction of spins (probabilistic nodes). It provides some degrees of freedom to design a simpler landscape of Hamiltonian (energy) than that of the conventional two-body Hamiltonian. The simpler landscape makes it easier to reach the global minimum energy. The proposed three-body CIL circuits are designed and evaluated with the conventional two-body CIL circuits, resulting in few-times higher convergence rates with negligible area overhead on FPGA.
