2020
DOI: 10.20944/preprints202007.0506.v1
Preprint

Darknet on OpenCL: A Multi-platform Tool for Object Detection and Classification

Abstract: The article gives an overview of the challenges and problems on the way from state-of-the-art CUDA-accelerated neural network code to multi-GPU code. For this purpose, the authors describe the journey of porting the fully featured, CUDA-accelerated Darknet engine available on GitHub to OpenCL. The article presents the lessons learned and the techniques that were put in place to make this port happen. There are few other implementations on GitHub that leverage the OpenCL standard, and a few have …

Citations: cited by 4 publications (9 citation statements)
References: 29 publications
“…To implement our proposed method, we modified the Darknet-OpenCL framework [ 43 ], which is an open-source framework ported from Darknet [ 44 ] using CUDA [ 45 ] to OpenCL [ 46 ]. To accelerate matrix multiplication, which takes up most of the time in deep learning operations, we modified the basic linear algebra subprograms (BLAS) library of Darknet-OpenCL, OpenBLAS [ 47 ] for CPU, and CLBlast [ 48 ] for GPU.…”
Section: Methods (mentioning)
confidence: 99%
“…The number of epochs was set to 300. Our experiments were conducted using the Darknet-OpenCL [ 43 ] framework.…”
Section: Methods (mentioning)
confidence: 99%
“…FIGURE 2 Multi-GPU computing monitor state example. FIGURE 3 YOLO2 training process step example. It works very well and improves execution in the most nested loop by a number of tuning value that is computed dynamically by dividing the "filters" variable by "4". The last important information is that parameter "t" cannot be correctly checked in the conditions or printed out.…”
Section: Listing 2: Cpu Run Time Assert Protection Example (mentioning)
confidence: 99%
“…Therefore, an OpenCL-based GPU-accelerated library needs to be linked for deep learning frameworks to be efficiently executed in embedded systems. OpenCL Caffe [40], DeepCL [41], TensorFlow Lite [42], and Darknet on OpenCL [43] are deep learning frameworks that support OpenCL-based GPU-accelerated libraries at present.…”
Section: Deep Learning Framework (mentioning)
confidence: 99%