2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops 2014
DOI: 10.1109/cvprw.2014.106

A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks

Cited by 259 publications (121 citation statements)
References: 13 publications
“…Our implementation running on the Tegra K1 or on the [6,7,21]. Results for system (dark) and differential power (light).…”
Section: Results (mentioning)
confidence: 99%
“…A growing number of researchers are proposing to address the recognition of actions and objects with brain-inspired algorithms featuring multi-stage feature detectors and classifiers which can be customized using machine learning [6,7,11]. These techniques, collectively known as deep learning, have recently achieved record-breaking results on highly challenging datasets using automatic (supervised or partially unsupervised) learning.…”
Section: Introduction (mentioning)
confidence: 99%
“…One example of previous work that implement weight stationary dataflow is nn-X, or neuFlow [85], which uses eight 2-D convolution engines for processing a 10×10 filter. There are total 100 MAC units, i.e.…”
Section: B Energy-efficient Dataflow For Accelerators (mentioning)
confidence: 99%
“…fpgaConvNet provides support for fixed-point as well as single- and double-precision floating-point representation. In the evaluation phase, Q8.8 fixed-point representation was used which is also used in the FPGA works that we compare with and has been extensively tested in the literature to give similar results to neural networks implemented in 32-bit floating-point [6].…”
Section: Discussion (mentioning)
confidence: 99%
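The Q8.8 format mentioned in the statement above stores a value in 16 bits with 8 fractional bits, so one unit of the integer code is 1/256. The following is a minimal Python sketch of that idea, not code from fpgaConvNet or nn-X; the function names, the saturating behavior, and the truncating multiply are assumptions chosen for illustration.

```python
# Illustrative Q8.8 fixed-point helpers (hypothetical names, not from any cited work).
Q_FRAC_BITS = 8                    # 8 fractional bits
Q_SCALE = 1 << Q_FRAC_BITS         # 256 codes per unit
Q_MIN, Q_MAX = -32768, 32767       # signed 16-bit range


def saturate(code: int) -> int:
    """Clamp an integer code to the signed 16-bit range (assumed overflow policy)."""
    return max(Q_MIN, min(Q_MAX, code))


def to_q88(x: float) -> int:
    """Quantize a real value to a Q8.8 integer code.

    Uses Python's round(), which rounds exact ties to the nearest even integer.
    """
    return saturate(round(x * Q_SCALE))


def from_q88(code: int) -> float:
    """Convert a Q8.8 integer code back to a real value."""
    return code / Q_SCALE


def q88_mul(a: int, b: int) -> int:
    """Multiply two Q8.8 codes.

    The raw product has 16 fractional bits, so it is shifted right by 8
    (truncating toward negative infinity) to return to Q8.8.
    """
    return saturate((a * b) >> Q_FRAC_BITS)


if __name__ == "__main__":
    w, x = to_q88(1.5), to_q88(2.0)
    print(from_q88(q88_mul(w, x)))   # product of 1.5 and 2.0 in Q8.8
    print(from_q88(to_q88(0.1)))     # nearest Q8.8 value to 0.1
```

With round-to-nearest, the quantization error of any in-range value is at most half a step, that is 1/512, which gives a rough sense of why Q8.8 is often reported as close to 32-bit floating point for inference.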
“…By targeting the larger Xilinx Virtex-6 VLX240T FPGA, NeuFlow achieved 147 GOp/s at 10W. Finally, in 2014, the design was ported to Xilinx Zynq XC7045 SoC under the name nn-X [6] where it achieved 200 GOp/s at 4W. Nevertheless, systolic implementations suffer from complex routing logic and can support convolutions only up to the maximum implemented kernel size, e.g.…”
Section: Related Work (mentioning)
confidence: 99%
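The throughput and power figures quoted in the statement above can be compared as energy efficiency in GOp/s per watt. The short Python sketch below does only that arithmetic; the figures are taken from the quoted statement as written, and the function and variable names are hypothetical.

```python
def gops_per_watt(gops: float, watts: float) -> float:
    """Energy efficiency as throughput (GOp/s) divided by power (W)."""
    return gops / watts


# Figures as quoted in the citation statement, not independently verified.
neuflow_virtex6 = gops_per_watt(147.0, 10.0)   # NeuFlow on Virtex-6
nnx_zynq = gops_per_watt(200.0, 4.0)           # nn-X on Zynq XC7045

if __name__ == "__main__":
    print(f"NeuFlow: {neuflow_virtex6:.1f} GOp/s/W")
    print(f"nn-X:    {nnx_zynq:.1f} GOp/s/W")
    print(f"ratio:   {nnx_zynq / neuflow_virtex6:.2f}x")
```

On these quoted numbers the Zynq port works out to 50 GOp/s/W against 14.7 GOp/s/W for the Virtex-6 design, roughly a 3.4x improvement in efficiency.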