2017 IEEE International Conference on Cluster Computing (CLUSTER)
DOI: 10.1109/cluster.2017.45
A Power-Efficient Accelerator Based on FPGAs for LSTM Network

Cited by 24 publications (10 citation statements) | References 5 publications
“…In future studies, a wider range of activities should be included to provide more information about health-related daily PAs. Though we achieved high classification performance using the RF classifier, applying other advanced machine learning models, such as recurrent neural networks including long short-term memory (LSTM) networks [45,46], and comparing their performance could be considered in a future study. Finally, we trained the models using data collected from young healthy adults only.…”
Section: Contributions and Limitations
confidence: 99%
“…Zhang et al. (2017) deployed an efficient accelerator targeted at the execution of LSTM networks. They minimized power consumption, time, and energy by pipelining the execution of the matrix multiplication operations, the element-wise computations, and the other stages.…”
Section: Related Work
confidence: 99%
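The pipelining this excerpt describes splits each LSTM time step into a matrix-multiplication stage and an element-wise stage. Below is a minimal NumPy sketch of one LSTM step written to make those two stages explicit; the function and variable names are illustrative, not taken from the accelerated design.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step, split into the two stages an FPGA
    accelerator can overlap across consecutive time steps."""
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    # Stage 1: the large matrix multiplications, computing all four
    # gate pre-activations (input, forget, output, candidate) at once.
    # W: (4*hidden, input), U: (4*hidden, hidden), b: (4*hidden,)
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    # Stage 2: cheap element-wise activations and state updates.
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c
```

In a pipelined design, while the element-wise units handle Stage 2 of step t, the multiplier array can already start Stage 1 of step t+1; that overlap is the kind of saving in time and energy the excerpt refers to.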
“…As shown in Table 1, the trained TM exhibits a significant performance improvement over the SM trained with the standard cross-entropy loss. Nevertheless, it is worth noting that the training time of the SM is only about a quarter of the TM's training time, and the testing time of the SM is only about one-sixth of the TM's testing time, as the LSTM-RNN operation is very time-consuming [30]. Moreover, as illustrated in Table 1, the teaching loss gives the SM a performance improvement of 7.21%, which shows that the proposed CMDL is effective for boosting the performance of a lightweight model.…”
Section: Performance Analysis of CMDL
confidence: 99%
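The excerpt does not spell out the form of CMDL's "teaching loss", but losses of this kind are commonly a weighted sum of the standard cross-entropy and a term pulling the student model (SM) toward the teacher model (TM). A hedged NumPy sketch of such a distillation-style objective follows; the weight alpha, the temperature t, and all names are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

def softmax(z, t=1.0):
    e = np.exp(z / t - np.max(z / t))   # numerically stable softmax
    return e / e.sum()

def teaching_loss(student_logits, teacher_logits, label, alpha=0.5, t=2.0):
    """Hypothetical distillation-style objective: cross-entropy on the
    ground-truth label plus a KL term pulling the student (SM) toward
    the teacher (TM). Not the exact CMDL formulation."""
    ce = -np.log(softmax(student_logits)[label] + 1e-12)
    p_t = softmax(teacher_logits, t)    # softened teacher distribution
    p_s = softmax(student_logits, t)    # softened student distribution
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)))
    return (1 - alpha) * ce + alpha * (t ** 2) * kl
```

Only the lightweight SM is needed at test time, which is consistent with the excerpt's observation that the SM trains and tests several times faster than the TM.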
“…For example, in [28] the network contains 313,603 parameters with one LSTM-RNN layer, four convolution layers, and two fully connected layers. In addition, the time complexity of models composed of CNN and LSTM-RNN layers is high during training and prediction, as the LSTM-RNN operation is time-consuming [30]. Thus, these networks take a long time to automatically predict the types of signals.…”
Section: Introduction
confidence: 99%
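For context on parameter counts like the one quoted above, an LSTM layer's size follows a closed form: four gates, each with an input weight matrix, a recurrent weight matrix, and a bias vector. A small sketch, with dimensions chosen as arbitrary examples rather than those of the cited network:

```python
def lstm_param_count(input_dim, hidden_dim):
    # Four gates (input, forget, output, candidate), each with
    # input weights, recurrent weights, and a bias vector.
    return 4 * (input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim)

print(lstm_param_count(32, 128))  # 82432
```

The quadratic hidden_dim term, combined with the strictly sequential recurrence over time steps, is why LSTM-RNN layers dominate training and prediction time in such hybrid CNN/LSTM models.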