Jan Lappas scite author profile

Jan Lappas

5 Publications

24 Citation Statements Received

0 Citation Statements Given


Affiliations

University of Kaiserslautern

Publications


Efficient Hardware Architectures for 1D- and MD-LSTM Networks

Rybalkin, Sudarshan, Weis, et al. 2020

J Sign Process Syst


Recurrent Neural Networks, in particular One-dimensional and Multidimensional Long Short-Term Memory (1D-LSTM and MD-LSTM), have achieved state-of-the-art classification accuracy in many applications such as machine translation, image caption generation, handwritten text recognition, and medical imaging. However, high classification accuracy comes at the cost of high compute, storage, and memory bandwidth requirements, which make deployment challenging, especially on energy-constrained platforms such as portable devices. Compared to CNNs, few investigations exist on efficient hardware implementations for 1D-LSTM, especially under energy constraints, and there is no research publication on hardware architectures for MD-LSTM. In this article, we present two novel architectures for LSTM inference: a hardware architecture for MD-LSTM, and a DRAM-based Processing-in-Memory (DRAM-PIM) hardware architecture for 1D-LSTM. We present for the first time a hardware architecture for MD-LSTM and show a trade-off analysis of accuracy and hardware cost for various precisions. We implement the new architecture as an FPGA-based accelerator that outperforms an NVIDIA K80 GPU implementation in runtime by up to 84× and in energy efficiency by up to 1238× on a challenging dataset for historical document image binarization from the DIBCO 2017 contest and on the well-known MNIST dataset for handwritten digit recognition. Our accelerator demonstrates the highest accuracy and comparable throughput in comparison to state-of-the-art FPGA-based implementations of multilayer perceptrons for the MNIST dataset. Furthermore, we present a new DRAM-PIM architecture for 1D-LSTM targeting energy-efficient compute platforms such as portable devices. The DRAM-PIM architecture integrates the computation units in close proximity to the DRAM cells in order to maximize data parallelism and energy efficiency. The proposed DRAM-PIM design is 16.19× more energy efficient than the FPGA implementation. The total chip area overhead of this design is 18% compared to a commodity 8 Gb DRAM chip. Our experiments show that the DRAM-PIM implementation delivers a throughput of 1309.16 GOp/s for an optical character recognition application.


A Lean, Low Power, Low Latency DRAM Memory Controller for Transprecision Computing

Sudarshan, Lappas, Weis, et al. 2019


An In-DRAM Neural Network Processing Engine

Sudarshan, Lappas, Ghaffar, et al. 2019


A Novel DRAM Architecture for Improved Bandwidth Utilization and Latency Reduction Using Dual-Page Operation

Sudarshan, Steiner, Jung, et al. 2021

IEEE Trans. Circuits Syst. II


Correction to: Efficient Hardware Architectures for 1D- and MD-LSTM Networks

Rybalkin, Sudarshan, Weis, et al. 2021

J Sign Process Syst


