Design space exploration for a hardware-accelerated embedded real-time pose estimation using Vivado HLS

Joseph, Jan Moritz; Mey, Morten; Ehlers, Kristian; Blochwitz, Christopher; Winker, Tobias; Pionteck, Thilo

doi:10.1109/reconfig.2017.8279785

Cited by 4 publications

(4 citation statements)

References 7 publications


“…However, if this is compared to our implementation on an Ubuntu i7 platform (without the application of the rules that increase robustness), our software acceleration method achieves a latency less than 40% of the lowest latency achieved in [1]. The face alignment applications ([8, 9, 10, 12]) based on ERTs [14] achieve a relatively high speed (between 16 and 45 fps) but they concern different applications such as face recognition, pose estimation, etc., and some of them (e.g., [9]) align a smaller number of landmarks, which is a faster procedure. The yawning detection approaches [30, 32] are based on CNNs and operate at a significantly smaller speed.…”

Section: Discussion (mentioning)

confidence: 99%

“…The authors of [8] implement a face recognition algorithm using a Xilinx platform and achieve a processing speed of 45 frames-per-second (fps). In [9], an algorithm is presented that can be executed on an embedded platform (Xilinx FPGA based on ARM A9 processor) that estimates the pose of the hand using 23 landmark points reporting a 30 fps rate. In [10], J. Goenetxea et al developed a 3D face model tracking application using 68 landmarks achieving a rate of approximately 30 fps.…”

Section: Introduction (mentioning)

confidence: 99%


A High Performance and Robust FPGA Implementation of a Driver State Monitoring Application

Christakos, Petrellis, Mousouliotis et al. 2023

Sensors

A high-performance Driver State Monitoring (DSM) application for the detection of driver drowsiness is presented in this paper. The popular Ensemble of Regression Trees (ERTs) machine learning method has been employed for the alignment of 68 facial landmarks. Open-source implementation of ERTs for facial shape alignment has been ported to different platforms and adapted for the acceleration of the frame processing speed using reconfigurable hardware. Reducing the frame processing latency saves time that can be used to apply frame-to-frame facial shape coherency rules. False face detection and false shape estimations can be ignored for higher robustness and accuracy in the operation of the DSM application without sacrificing the frame processing rate that can reach 65 frames per second. The sensitivity and precision in yawning recognition can reach 93% and 97%, respectively. The implementation of the employed DSM algorithm in reconfigurable hardware is challenging since the kernel arguments require large data transfers and the degree of data reuse in the computational kernel is low. Hence, unconventional hardware acceleration techniques have been employed that can also be useful for the acceleration of several other machine learning applications that require large data transfers to their kernels with low reusability.


“…From the comparison presented in Table 4, it is obvious that the achieved frame processing speed is much higher than the related approaches. More specifically, the face alignment applications ([5], [6], [7], [9]) based on [11] achieve a relatively high speed but they concern different applications such as face recognition, pose estimation, etc., and some of them (e.g., [6]) align a smaller number of landmarks, thus the latency is also lower. The yawning detection approaches [26], [28] are based on CNNs and operate at a significantly smaller speed.…”

Section: Discussion (mentioning)

confidence: 99%

“…The authors of [5] implement a face recognition algorithm using a Xilinx platform and achieve a processing speed of 45 frames-per-second (fps). In [6], an algorithm is presented that can be executed on an embedded platform (Xilinx FPGA based on ARM A9 processor) that estimates the pose of the hand using 23 landmark points reporting a 30fps rate. In [7], J. Goenetxea et al developed a 3D face model tracking application using 68 landmarks achieving a rate of approximately 30fps.…”

Section: Introduction (mentioning)

confidence: 99%

A High Performance and Robust FPGA Implementation of a Driver State Monitoring Application

Christakos, Petrellis, Mousouliotis et al. 2023

Preprint

A high performance Driver State Monitoring (DSM) application for the detection of driver drowsiness is presented in this paper. It relies on the usage of an Ensemble of Regression Trees (ERTs) machine learning method that aligns 68 facial landmarks. Special focus is given on the acceleration of the frame processing using reconfigurable hardware. Reducing the frame processing latency saves time that can be used to apply frame-to-frame facial shape coherency rules. False face detection and false shape estimations can be ignored for higher robustness and accuracy in the operation of the DSM application without reducing the frame processing rate that can reach 65 frames per second. The sensitivity and precision in yawning recognition can reach 93% and 97%, respectively. The implementation of the employed DSM algorithm in reconfigurable hardware is challenging since the kernel arguments require large data transfers and the degree of data reuse in the computational kernel is low. Due to this, unconventional hardware acceleration techniques have been employed that can also be useful for the acceleration of several other applications.


Eye Tracker Acceleration in Reconfigurable Hybrid Systems

Roberto, Molina, Petrino 2018

2018 IEEE Biennial Congress of Argentina (ARGENCON)
