Generating FPGA-based image processing accelerators with Hipacc: (Invited paper)

Reiche, Oliver; Ozkan, M. Akif; Membarth, Richard; Teich, Jürgen; Hannig, Frank

doi:10.1109/iccad.2017.8203894

Cited by 24 publications

(15 citation statements)

References 14 publications


“…Heterogeneous Image Processing Acceleration (HIPAcc) framework [97] is shown in FIGURE 16. It is a DSL and source-to-source compiler that supports C/C++, CUDA, OpenCL, Renderscript, and HLS-friendly C/C++, which is able to produce low-level code for image processing kernels on a wide range of GPUs, CPUs and FPGAs.…”

Section: HIPAcc DSL (mentioning)

confidence: 99%

Towards Automatic High-Level Code Deployment on Reconfigurable Platforms: A Survey of High-Level Synthesis Tools and Toolchains

et al. 2020


Heterogeneous computing systems with tightly coupled processors and reconfigurable logic blocks provide great scope to improve software performance by executing each section of code on the processor or custom hardware accelerator that best matches its requirements and the system optimisation goals. This paper is motivated by the idea of a software tool that can automatically accomplish the task of deploying code, originally written for a conventional computer, to the processors and reconfigurable logic blocks in a heterogeneous system. We undertake an extensive survey of high-level synthesis tools to determine how close we are to this vision, and to identify any capability gaps. The survey is structured according to a new framework that clearly expresses the relationships between the many tools surveyed. We find that none of the existing tools can deploy general high-level code without manual intervention. Logic synthesis from arbitrary high-level code remains an open problem with dynamic data structures, function pointers and recursion all presenting challenges. Other challenges include automating the tasks of code partitioning, optimisation and design space exploration.


“…Unlike the prior work, [24] suggests static OpenVX compilation for low-power embedded systems instead of runtime-library implementations. Our work is similar to this since we statically analyze a given OpenVX application and combine the benefits of domain-specific code generation approaches [3,8,10,14,16,19]. Halide [16], Hipacc [8], and PolyMage [10] are image processing DSLs that provide language constructs and scheduling primitives to generate code that is optimized for the target device, i.e., CPUs, GPUs.…”

Section: Related Work (mentioning)

confidence: 99%

“…CAPH [20], RIPL [23], and Rigel [6] are image processing DSLs that generate optimized code for FPGAs. Hipacc-FPGA [19] supports HLS tools of both Xilinx and Intel, while Halide-HLS [14], PolyMage-HLS [3], and RIPL only target Xilinx devices. CAPH relies upon the actor/dataflow model of computation to generate VHDL or SystemC code.…”

Section: Related Work (mentioning)

confidence: 99%

HipaccVX: wedding of OpenVX and DSL-based code generation

Ozkan, Qiao, et al. 2020, J Real-Time Image Proc (self-citation)


Writing programs for heterogeneous platforms optimized for high performance is hard since this requires the code to be tuned at a low level with architecture-specific optimizations that are most times based on fundamentally differing programming paradigms and languages. OpenVX promises to solve this issue for computer vision applications with a royalty-free industry standard that is based on a graph-execution model. Yet, OpenVX's algorithm space is constrained to a small set of vision functions. This hinders accelerating computations that are not included in the standard. In this paper, we analyze OpenVX vision functions to find an orthogonal set of computational abstractions. Based on these abstractions, we couple an existing domain-specific language (DSL) back end to the OpenVX environment and provide language constructs to the programmer for the definition of user-defined nodes. In this way, we enable optimizations that are not possible to detect with OpenVX graph implementations using the standard computer vision functions. These optimizations can double the throughput on an Nvidia GTX GPU and decrease the resource usage of a Xilinx Zynq FPGA by 50% for our benchmarks. Finally, we show that our proposed compiler framework, called HipaccVX, can achieve better results than the state-of-the-art approaches Nvidia VisionWorks and Halide-HLS.


“…Homogeneous general purpose processors provide flexibility to implement a variety of applications and facilitate programmability. However, these platforms cannot take advantage of the domain knowledge to optimize the energy efficiency for specific application domains, such as machine learning, communication protocols, and autonomous driving [1], [2], [3]. In contrast, heterogeneous systems-on-chip (SoCs) that combine general purpose and specialized processors (e.g., audio/video codecs and communication modems) offer great potential to achieve higher efficiency [4].…”

Section: Introduction (mentioning)

confidence: 99%

DS3: A System-Level Domain-Specific System-on-Chip Simulation Framework

Arda, Krishnakumar, Goksoy, et al. 2020, IEEE Trans. Comput.


Heterogeneous systems-on-chip (SoCs) are highly favorable computing platforms due to their superior performance and energy efficiency potential compared to homogeneous architectures. They can be further tailored to a specific domain of applications by incorporating processing elements (PEs) that accelerate frequently used kernels in these applications. However, this potential is contingent upon optimizing the SoC for the target domain and utilizing its resources effectively at runtime. To this end, system-level design (including scheduling, power-thermal management algorithms and design space exploration studies) plays a crucial role. This paper presents a system-level domain-specific SoC simulation (DS3) framework to address this need. DS3 enables both design space exploration and dynamic resource management for power-performance optimization of domain applications. We showcase DS3 using six real-world applications from wireless communications and radar processing domain. DS3, as well as the reference applications, is shared as open-source software to stimulate research in this area.
