High‐Level Synthesis: Productivity, Performance, and Software Constraints

Liang, Yun; Rupnow, Kyle; Li, Yinan; Min, Dongbo; Minh, N.; Chen, Deming

doi:10.1155/2012/649057

Cited by 65 publications

(31 citation statements)

References 31 publications

“…For example, in the cases of Rupnow et al [33] and Liang et al [34], the performance difference between the two is up to 40× for a high-definition stereo matching implementation. Consequently, various academics proposed different solutions to this issue [35].…”

Section: Algorithm 1 Pseudocode of the Original Implementation (mentioning)

confidence: 99%

Embedded Implementation of VHR Satellite Image Segmentation

2017

Architecture‐Aware Optimization Strategies in Real‐time Image Processing

Processing and analysis of Very High Resolution (VHR) satellite images provide a mass of crucial information, which can be used for urban planning, security issues or environmental monitoring. However, they are computationally expensive and, thus, time consuming, while some of the applications, such as natural disaster monitoring and prevention, require high efficiency performance. Fortunately, parallel computing techniques and embedded systems have made great progress in recent years, and a series of massively parallel image processing devices, such as digital signal processors or Field Programmable Gate Arrays (FPGAs), have been made available to engineers at a very convenient price and demonstrate significant advantages in terms of running-cost, embeddability, power consumption flexibility, etc. In this work, we designed a texture region segmentation method for very high resolution satellite images by using the level set algorithm and the multi-kernel theory in a high-abstraction C environment and realize its register-transfer level implementation with the help of a new proposed high-level synthesis-based design flow. The evaluation experiments demonstrate that the proposed design can produce high quality image segmentation with a significant running-cost advantage.

“…With such powerful optimizations, HLS offers increased productivity with lower design effort; however, in practice these transformations are difficult to apply: only certain data access patterns are supported, limiting the applicability of an important HLS feature. Recent studies show that there is still a significant performance gap between manual design and HLS-generated designs [23,17,10], and the inability to apply these optimizations is one of the causes of this gap.…”

Section: Related Work (mentioning)

confidence: 99%

“…State-of-the-art HLS tools cover a wide range of input source code and achieve high-quality results [8]. These tools have achieved significant improvement; however, recent studies show that although these tools can offer high quality designs for small kernels, there is still a significant performance gap between HLS and manual design for real-world complex applications [23,17,10]. For example, [23,17] demonstrated a 40X difference between HLS and the manual design for a high-definition stereo matching implementation.…”

Section: Introduction (mentioning)

confidence: 99%

“…These tools have achieved significant improvement; however, recent studies show that although these tools can offer high quality designs for small kernels, there is still a significant performance gap between HLS and manual design for real-world complex applications [23,17,10]. For example, [23,17] demonstrated a 40X difference between HLS and the manual design for a high-definition stereo matching implementation. Small kernels (often used as simple HLS benchmarks) contain a single block (a loop nest), but real-world applications often contain many data-dependent blocks that communicate through complex data access patterns.…”

Section: Introduction (mentioning)

confidence: 99%

“…As seen in efficient manual RTL implementations of these algorithms, it is crucial to minimize the communication granularity, pipeline the data-dependent blocks, and duplicate compute units to improve both throughput and latency. However, existing HLS tools fail to enable intra-block parallelization and inter-block pipelining when the data access patterns are complex [23,17] and although advanced commercial tools support these optimizations, they are not always able to efficiently identify the opportunity and transform source code to enable such optimizations.…”

Section: Introduction (mentioning)

confidence: 99%

Improving high level synthesis optimization opportunity through polyhedral transformations

Zuo, Liang, et al. 2013

Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays

Self Cite

High level synthesis (HLS) is an important enabling technology for the adoption of hardware accelerator technologies. It promises the performance and energy efficiency of hardware designs with a lower barrier to entry in design expertise, and shorter design time. State-of-the-art high level synthesis now includes a wide variety of powerful optimizations that implement efficient hardware. These optimizations can implement some of the most important features generally performed in manual designs including parallel hardware units, pipelining of execution both within a hardware unit and between units, and fine-grained data communication. We may generally classify the optimizations as those that optimize hardware implementation within a code block (intra-block) and those that optimize communication and pipelining between code blocks (inter-block). However, both optimizations are in practice difficult to apply. Real-world applications contain data-dependent blocks of code and communicate through complex data access patterns. Existing high level synthesis tools cannot apply these powerful optimizations unless the code is inherently compatible, severely limiting the optimization opportunity. In this paper we present an integrated framework to model and enable both intra- and inter-block optimizations. This integrated technique substantially improves the opportunity to use the powerful HLS optimizations that implement parallelism, pipelining, and fine-grained communication. Our polyhedral model-based technique systematically defines a set of data access patterns, identifies effective data access patterns, and performs the loop transformations to enable the intra- and inter-block optimizations. Our framework automatically explores transformation options, performs code transformations, and inserts the appropriate HLS directives to implement the HLS optimizations. Furthermore, our framework can automatically generate the optimized communication blocks for fine-grained communication between hardware blocks. Experimental evaluation demonstrates that we can achieve an average of 6.04X speedup over the high level synthesis solution without our transformations to enable intra- and inter-block optimizations.

FPGA '13, February 11-13, 2013, Monterey, California, USA.

Hardware Benchmarking of Cryptographic Algorithms Using High-Level Synthesis Tools: The SHA-3 Contest Case Study

Homsirikamol, Gaj 2015

Lecture Notes in Computer Science
