2019
DOI: 10.1109/access.2019.2907261

Deep Neural Network Hardware Implementation Based on Stacked Sparse Autoencoder

Abstract: Deep learning techniques have gained prominence in research in recent years; however, deep learning algorithms have a high computational cost, making them hard to apply in many commercial applications. In response, new alternatives have been studied, and methodologies for accelerating complex algorithms, including those based on reconfigurable hardware, have shown significant results. The objective of this paper is therefore to propose a neural network hardware i…

Cited by 37 publications (44 citation statements)
References 20 publications
Citing publications span 2019–2023.
“…This therefore allows applications to achieve real-time or near real-time processing. The FPGA allows the exploitation of the algorithm parallelization and the development of dedicated hardware to obtain performance improvement [9][10][11][12][13][14][15]. However, FPGA implementations found in the literature are often developed with sequential processing schemes in some stages of the Otsu algorithm, limiting the hardware's processing speed [16][17][18][19][20][21].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
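For context on the quoted passage: Otsu's method scores every candidate gray level by between-class variance, and it is these per-level evaluations that dedicated parallel hardware can compute concurrently, whereas a sequential scheme must iterate over them. A minimal software sketch of the algorithm follows (a Python/NumPy reference implementation for illustration, not the cited FPGA design):

import numpy as np

def otsu_threshold(image):
    # Histogram and probability mass of an 8-bit grayscale image.
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # class-0 probability up to each level
    mu = np.cumsum(prob * np.arange(256))   # cumulative mean up to each level
    mu_total = mu[-1]
    # Between-class variance for every candidate threshold t:
    # sigma_B^2(t) = (mu_total * omega(t) - mu(t))^2 / (omega(t) * (1 - omega(t)))
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)        # undefined where one class is empty
    return int(np.argmax(sigma_b))

The 256 variance evaluations in the loop above are independent of one another, which is the parallelization opportunity the quoted passage attributes to FPGA implementations.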
“…However, the path to optimal implementation of DNN topologies on FPGAs remains complex, requiring expertise in several areas, DL algorithms and topologies, embedded and reconfigurable computing. Custom design can produce the best performance solutions, but it is an optimization that takes time and lacks flexibility [29], [35]. In this context, tools are available but mostly oriented towards mainframe applications, such as Intel Open Vino for Arria 10 GX [36] and Vitis-AI cards for Alveo or UltraScale available in collaborative environments such as Amazon Web Services EC2-F1 [37].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In [8] sparse autoencoder architecture with network architecture of 196 input and output neurons along with 100 hidden neurons is implemented using Verilog HDL. DNN hardware realization using a technique called SSAE was implemented in [9] which used a concept of a systolic array that allows the use of many neurons and various layers. With the help of related works, we propose an HT detection which provides better accuracy at faster processing time.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
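To make the quoted 196-100-196 topology concrete, here is an assumed software analogue in Python/NumPy (the cited designs are implemented in Verilog HDL; the weights below are untrained placeholders, not parameters from either paper). Each layer is a matrix–vector multiply-accumulate followed by an activation, the regular dataflow that a systolic array pipelines across many neurons and layers:

import numpy as np

rng = np.random.default_rng(0)

# Placeholder parameters for the 196-100-196 topology quoted above.
W1 = rng.standard_normal((100, 196)) * 0.1   # encoder weights: 196 inputs -> 100 hidden
b1 = np.zeros(100)
W2 = rng.standard_normal((196, 100)) * 0.1   # decoder weights: 100 hidden -> 196 outputs
b2 = np.zeros(196)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    # Encode a 196-dim input (e.g. a flattened 14x14 patch) into 100 hidden
    # activations, then reconstruct it. Each line is the multiply-accumulate
    # pattern that a systolic array maps onto a grid of processing elements.
    h = sigmoid(W1 @ x + b1)       # hidden layer (100 neurons)
    x_hat = sigmoid(W2 @ h + b2)   # reconstruction (196 neurons)
    return h, x_hat

h, x_hat = forward(rng.random(196))

Because every neuron in a layer repeats the same multiply-accumulate over shared inputs, a systolic array can stream weights and activations through a fixed grid of processing elements, which is broadly how, as the quoted passage notes, such a design accommodates many neurons across several layers.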