Cédric Bourrasset scite author profile

Cédric Bourrasset

4 Publications

83 Citation Statements Received

49 Citation Statements Given


Affiliations

Atos (France), Institut Pascal, National Computer Center for Higher Education

Publications


Tactics to Directly Map CNN Graphs on Embedded FPGAs

Abdelouahab, Pelcat, Sérot, et al. 2017

IEEE Embedded Syst. Lett.


Deep Convolutional Neural Networks (CNNs) are the state of the art in image classification. Since CNN feed-forward propagation involves highly regular parallel computation, it benefits from a significant speed-up when running on fine-grained parallel programmable logic devices. As a consequence, several studies have proposed FPGA-based accelerators for CNNs. However, because of the large computational power required by CNNs, none of the previous studies has proposed a direct mapping of the CNN onto the physical resources of an FPGA, allocating each processing actor to its own hardware instance. In this paper, we demonstrate the feasibility of the so-called direct hardware mapping (DHM) and discuss several tactics we explore to make DHM usable in practice. As a proof of concept, we introduce the HADDOC2 open-source tool, which automatically transforms a CNN description into a synthesizable hardware description with platform-independent direct hardware mapping.


Design productivity of a high level synthesis compiler versus HDL

Pelcat, Bourrasset, Maggiani, et al. 2016


The complexity of hardware systems is currently growing faster than the productivity of system designers and programmers. This phenomenon is called the Design Productivity Gap and results in inflating design costs. In this paper, the notion of Design Productivity is precisely defined, as well as a metric to assess the Design Productivity of a High-Level Synthesis (HLS) method versus a manual hardware description. The proposed Design Productivity metric evaluates the trade-off between design efficiency and implementation quality. The method is generic enough to be used for comparing several HLS methods of different natures, opening opportunities for further progress in Design Productivity. To demonstrate the Design Productivity evaluation method, an HLS compiler based on the CAPH language is compared to manual VHDL writing. The causes that make VHDL lower level than CAPH are discussed. Versions of the sub-pixel interpolation filter from the MPEG HEVC standard are implemented, and a design productivity gain of 2.3× on average is measured for the CAPH HLS method. It results from an average gain in design time of 4.4× and an average loss in quality of 1.9×.
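The three figures quoted in this abstract are consistent with reading the productivity gain as the design-time gain divided by the quality loss. The sketch below checks that arithmetic; the ratio form and the function name `design_productivity_gain` are assumptions inferred from the reported numbers, not the paper's stated definition, and a ratio of averages need not equal the paper's average of per-design ratios.

```python
def design_productivity_gain(design_time_gain: float, quality_loss: float) -> float:
    """Assumed form of the metric: gain in design time divided by loss in quality.

    Both arguments are multiplicative factors (e.g. 4.4 means 4.4x).
    """
    if design_time_gain <= 0 or quality_loss <= 0:
        raise ValueError("factors must be positive")
    return design_time_gain / quality_loss


# Figures reported in the abstract: 4.4x design-time gain, 1.9x quality loss.
gain = design_productivity_gain(4.4, 1.9)
print(f"{gain:.1f}x")  # prints "2.3x"
```

Under this reading, a method that saves design time only pays off when the time factor exceeds the quality penalty, which is the trade-off the abstract describes.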


High-level dataflow programming for real-time image processing on smart cameras

Sérot, Berry, Bourrasset 2014

J Real-Time Image Proc


A Holistic Approach for Optimizing DSP Block Utilization of a CNN implementation on FPGA

Abdelouahab, Bourrasset, Pelcat, et al. 2016


Deep Neural Networks are becoming the de facto standard models for image understanding, and more generally for computer vision tasks. As they involve highly parallelizable computations, Convolutional Neural Networks (CNNs) are well suited to current fine-grained programmable logic devices. Thus, multiple CNN accelerators have been successfully implemented on FPGAs. Unfortunately, Field-Programmable Gate Array (FPGA) resources such as logic elements or Digital Signal Processing (DSP) units remain limited. This work presents a holistic method relying on approximate computing and design space exploration to optimize the DSP block utilization of a CNN implementation on FPGA. This method was tested by implementing a reconfigurable Optical Character Recognition (OCR) convolutional neural network on an Altera Stratix V device and varying both data representation and CNN topology in order to find the best combination in terms of DSP block utilization and classification accuracy. This exploration generated dataflow architectures of 76 CNN topologies with five different fixed-point representations. The most efficient implementation performs 883 classifications/sec at 256 × 256 resolution using 8% of the available DSP blocks.

