A 128 channel 290 GMACs/W machine learning based co-processor for intention decoding in brain machine interfaces

Chen, Yi; Yao, Enyi; Basu, Arindam

doi:10.1109/iscas.2015.7169319

Cited by 18 publications

(23 citation statements)

References 11 publications

“…It is important to consider how the performance of the chip varies in the face of variations of power supply voltage (VDD) and temperature. We use the normalization method suggested in [18] to increase the robustness of our chip with respect to common-mode variations in VDD and temperature. Following [18], we define the j-th normalized hidden layer value (h_{j,norm}) as:…”

Section: F Robustness (mentioning)

confidence: 99%

VLSI Extreme Learning Machine: A Design Space Exploration

Yao

Basu

2017

IEEE Trans. VLSI Syst.

Self Cite

In this paper, we describe a compact low-power high-performance hardware implementation of extreme learning machine for machine learning applications. Mismatches in current mirrors are used to perform the vector-matrix multiplication that forms the first stage of this classifier and is the most computationally intensive. Both regression and classification (on UCI data sets) are demonstrated and a design space tradeoff between speed, power, and accuracy is explored. Our results indicate that for a wide set of problems, σ_VT in the range of 15-25 mV gives optimal results. An input weight matrix rotation method to extend the input dimension and hidden layer size beyond the physical limits imposed by the chip is also described. This allows us to overcome a major limit imposed on most hardware machine learners. The chip is implemented in a 0.35-μm CMOS process and occupies a die area of around 5 mm × 5 mm. Operating from a 1 V power supply, it achieves an energy efficiency of 0.47 pJ/MAC at a classification rate of 31.6 kHz.

Index Terms: Classifier, extreme learning machine (ELM), low power, machine learning, neural networks.

Enyi Yao received the B.Eng. degree from the Harbin Institute of Technology, Harbin, China, in 2011. He is currently pursuing the Ph.D. degree in electrical and electronic engineering with the Nanyang Technological University, Singapore. His current research interests include low power analog, mixed-signal IC design, neuromorphic circuits design, and low power smart sensor circuits design for biomedical applications.

Arindam Basu received the B.Tech. and M.Tech.
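The title of the indexed paper quotes efficiency in GMACs/W, while the abstract above quotes pJ/MAC. The two units are reciprocals of each other, and the short sketch below converts between them. The function names are illustrative, and the comparison is only indicative, since the two figures come from different chips and operating points.

```python
def gmacs_per_watt_to_pj_per_mac(gmacs_per_watt):
    # 1 GMAC/s per watt equals 1e9 MACs per joule,
    # so the energy per MAC is 1e-9 J / value = 1000 pJ / value.
    return 1000.0 / gmacs_per_watt


def pj_per_mac_to_gmacs_per_watt(pj_per_mac):
    # The inverse conversion has the same form.
    return 1000.0 / pj_per_mac


# 290 GMACs/W (title of the indexed paper) expressed in pJ/MAC
coprocessor_pj = gmacs_per_watt_to_pj_per_mac(290.0)

# 0.47 pJ/MAC (abstract above) expressed in GMACs/W
elm_chip_gmacs = pj_per_mac_to_gmacs_per_watt(0.47)

print(f"{coprocessor_pj:.2f} pJ/MAC, {elm_chip_gmacs:.0f} GMACs/W")
```

Under this conversion, 290 GMACs/W corresponds to roughly 3.45 pJ/MAC, and 0.47 pJ/MAC to roughly 2128 GMACs/W.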

“…We validated the technique of increasing the number of weight vectors by rotation using a software model in MATLAB. To model an independent set of log-normal weights due to mismatch of sub-threshold transistors [17,23], we created a set of weights w_ij = e^x, where x follows a Gaussian distribution with 0 mean and standard deviation of 0.6. The reason for choosing this standard deviation is that the measured standard deviation of threshold voltage in this 0.35 µm CMOS process was 0.6 U_T, where U_T denotes the thermal voltage kT/q.…”

Section: Software Modeling and Validation Of Weight Rotation (mentioning)

confidence: 99%
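The weight model in the statement above (w_ij = e^x with x drawn from a zero-mean Gaussian of standard deviation 0.6) can be sketched in a few lines. The quoted work used MATLAB; the Python/NumPy version below is only an illustrative stand-in. The 128 × 128 array size and the `rotated_weights` helper are assumptions: the excerpt does not spell out the rotation scheme, so a plain circular row shift is used here as a placeholder.

```python
import numpy as np


def lognormal_weights(n_in, n_hidden, sigma=0.6, seed=0):
    """Draw w_ij = exp(x_ij) with x_ij ~ N(0, sigma^2), as in the quoted model."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=0.0, scale=sigma, size=(n_in, n_hidden))
    return np.exp(x)


def rotated_weights(w, shift):
    """Hypothetical rotation: circularly shift the rows of w by `shift`.

    Assumption for illustration only; the exact scheme used by the
    authors is not given in the excerpt.
    """
    return np.roll(w, shift, axis=0)


# Assumed physical array size (not stated in the excerpt).
w = lognormal_weights(128, 128)

# Reuse the same physical weights under four shifts to obtain a
# "virtual" weight matrix with four times as many hidden columns.
w_virtual = np.hstack([rotated_weights(w, s) for s in range(4)])

print(w.shape, w_virtual.shape)
```

Because every weight is an exponential of a Gaussian sample, all entries are positive and the log of the weights recovers the chosen standard deviation, which is a quick sanity check on the model.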

“…There are several reported hardware architectures exploiting randomness in VLSI for ELM [15][16][17]. Of these, [15] shows the application of ELM to a single input single output regression problem.…”

Section: Introduction (mentioning)

confidence: 99%

“…Of these, [15] shows the application of ELM to a single input single output regression problem. On the other hand, [16,17] have already shown good accuracy at the system level for applications like intention decoding [17] and spike sorting [16] requiring multiple inputs and outputs; hence, we pursue this architecture further. The first novelty of this paper is in applying such hardware to image based object recognition applications.…”

Section: Introduction (mentioning)

confidence: 99%

Hardware architecture for large parallel array of Random Feature Extractors applied to image recognition

et al. 2017

Self Cite

We demonstrate a low-power and compact hardware implementation of a Random Feature Extractor (RFE) core. With complex tasks like image recognition requiring a large set of features, we show how a weight reuse technique can virtually expand the random features available from the RFE core. Further, we show how to avoid the computation cost wasted on propagating "incognizant" or redundant random features. For proof of concept, we validated our approach by using our RFE core as the first stage of an Extreme Learning Machine (ELM), a two-layer neural network, and were able to achieve > 97% accuracy on the MNIST database of handwritten digits. The ELM's first stage of RFE is done on an analog ASIC occupying 5 mm × 5 mm area in 0.35 µm CMOS and consuming 5.95 µJ/classify while using ≈ 5000 effective hidden neurons. The ELM second stage, consisting of just adders, can be implemented as a digital circuit with an estimated power consumption of 20.9 nJ/classify. With a total energy consumption of only 5.97 µJ/classify, this low-power mixed signal ASIC can act as a co-processor in portable electronic gadgets with cameras.
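The energy budget in this abstract can be checked with simple arithmetic. The sketch below adds the two per-classification figures quoted above and reports the share taken by the digital second stage; the variable and function names are illustrative only.

```python
FIRST_STAGE_UJ = 5.95    # analog RFE stage, µJ per classification (from the abstract)
SECOND_STAGE_NJ = 20.9   # estimated digital second stage, nJ per classification


def total_energy_uj(first_uj, second_nj):
    # Convert the second stage from nJ to µJ before summing.
    return first_uj + second_nj / 1000.0


total = total_energy_uj(FIRST_STAGE_UJ, SECOND_STAGE_NJ)
second_stage_share = (SECOND_STAGE_NJ / 1000.0) / total

print(f"total = {total:.2f} µJ/classify, second stage = {second_stage_share:.2%}")
```

The sum rounds to the 5.97 µJ/classify stated in the abstract, and the second stage accounts for well under 1% of the total, so the analog first stage dominates the budget.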

“…K-means clustering is utilized for unsupervised classification of the pulse-count features. In future, an ML such as the one in [64] will be used to perform the decoding directly.…”

Section: Objectives (mentioning)

confidence: 99%

Low power smart sensor circuits for biomedical applications : applications to neural interfaces

Yao

Self Cite

mark binary classification datasets have been employed in the measurement, showing that the performance of our design is comparable with recent publications and software simulations of other machine learning systems. This system was implemented in a 0.35 µm CMOS process and can operate from a 0.8 V to 3.3 V power supply, with a lowest classification energy of 0.47 nJ/op and a maximum classification speed of 518 MMAC/s.
