In this paper, we present a hardware-software implementation of a deep neural network for object detection based on a point cloud obtained by a LiDAR sensor. The PointPillars network was used in the research, as it offers a reasonable compromise between detection accuracy and computational complexity. The Brevitas / PyTorch tools were used for network quantisation (described in our previous paper) and the FINN tool for hardware implementation on a reprogrammable Zynq UltraScale+ MPSoC device. The obtained results show that a significant reduction in computation precision, along with a few network architecture simplifications, allows the solution to be implemented on a heterogeneous embedded platform with a maximum AP loss of 19% in 3D, a maximum AP loss of 8% in BEV, and an execution time of 375 ms (of which the FPGA part takes 262 ms). We have also compared our solution in terms of inference speed with the Vitis AI implementation proposed by Xilinx (19 Hz frame rate). In particular, we have thoroughly investigated the fundamental causes of the frame rate difference between the two solutions. The code is available at https://github.com/vision-agh/pp-finn.
In this paper we present our research on the optimisation of a deep
neural network for 3D object detection in a point cloud. Techniques like
quantisation and pruning available in the Brevitas and PyTorch tools
were used. We performed the experiments for the PointPillars network,
which offers a reasonable compromise between detection accuracy and
calculation complexity. The aim of this work was to propose a variant of
the network which we will ultimately implement in an FPGA device. This
will allow for real-time LiDAR data processing with low energy
consumption. The obtained results indicate that even an aggressive
quantisation from 32-bit floating point to 2-bit integers in the main
part of the algorithm results in only a 5-9% decrease in detection
accuracy, while allowing an almost 16-fold reduction in model size.
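The quoted 16-fold size reduction follows directly from the bit widths: 32-bit floating-point weights are replaced by 2-bit integers (32 / 2 = 16). The sketch below illustrates symmetric uniform quantisation to 2 bits; it is an illustrative stand-in for the idea, not the exact quantisation-aware training procedure used by Brevitas:

```python
def quantize_2bit(weights):
    """Symmetric uniform quantisation of float weights to 2-bit integers.

    Two bits give four levels; here we use the signed range {-2, -1, 0, 1}.
    Storing 2 bits per weight instead of 32 yields the ~16x size reduction.
    """
    qmax = 1   # largest positive level representable in 2-bit signed
    qmin = -2  # smallest level
    scale = max(abs(w) for w in weights) / qmax
    q = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return q, scale  # a weight is recovered approximately as q_i * scale

q, s = quantize_2bit([0.8, -0.3, 0.05, -0.9])
```

In practice, quantisation-aware training (as in Brevitas) learns the scale and keeps a floating-point "shadow" copy of the weights during backpropagation, which is why the accuracy drop stays in the 5-9% range rather than collapsing.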
Abstract: Vision systems are a dynamically developing field of science and technology. This article presents an overview of the topics the Vision Systems Laboratory team is working on. These include vision systems for autonomous vehicles, medical image analysis applications, systems supporting employee training, thermal image analysis, the implementation and acceleration of vision algorithms on SoC FPGA and GPU devices, 4K-resolution video stream processing, as well as signal analysis using deep neural networks and collaboration on the ALICE experiment at CERN.