CHiMPS: A C-level compilation flow for hybrid CPU-FPGA architectures

Putnam, Andrew; Bennett, Dave; Dellinger, Eric; Mason, Jeff; Sundararajan, Prasanna; Eggers, Susan J.

doi:10.1109/fpl.2008.4629927

Cited by 50 publications

(35 citation statements)

References 9 publications


“…The CHiMPS compiler [38] targets applications for high-performance. The distinctive feature of CHiMPS is its many-cache, which is a hardware model that adapts the hundreds of small, independent FPGA memories to the specific memory needs of an application.…”

Section: A. Academic HLS Tools Evaluated in This Study (mentioning)

confidence: 99%

A Survey and Evaluation of FPGA High-Level Synthesis Tools

Nane, Sima, Pilato, et al. 2016

IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.



Abstract: High-level synthesis (HLS) is increasingly popular for the design of high-performance and energy-efficient heterogeneous systems, shortening time-to-market and addressing today's system complexity. HLS allows designers to work at a higher-level of abstraction by using a software program to specify the hardware functionality. Additionally, HLS is particularly interesting for designing FPGA circuits, where hardware implementations can be easily refined and replaced in the target device. Recent years have seen much activity in the HLS research community, with a plethora of HLS tool offerings, from both industry and academia. All these tools may have different input languages, perform different internal optimizations, and produce results of different quality, even for the very same input description. Hence, it is challenging to compare their performance and understand which is the best for the hardware to be implemented. We present a comprehensive analysis of recent HLS tools, as well as overview the areas of active interest in the HLS research community. We also present a first-published methodology to evaluate different HLS tools. We use our methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and use of resources.


“…Delft | No
Trident [38] | Los Alamos NL | No
CHiMPS [35] | U. Washington | No
Kiwi [18] | U. Cambridge | Yes
gcc2verilog [21] | U. Korea | No
HercuLeS [27] | Ajax Compiler | No
(a) Refers to array partitioning as part of the tool flow, the arrays can always be re-written into multiple partitions by the software designer.…”

Section: Memory Partitioning in HLS (mentioning)

confidence: 99%

Automated generation of banked memory architectures in the high-level synthesis of multi-threaded software

Chen, Anderson 2017

2017 27th International Conference on Field Programmable Logic and Applications (FPL)


The LegUp High-Level Synthesis (HLS) tool permits the synthesis of multi-threaded software into parallel hardware, where parallel software threads are realized as concurrently operating hardware units. A common performance bottleneck in any parallel implementation is memory bandwidth: parallel threads demand concurrent access to memory, resulting in contention that hurts performance. FPGAs contain an abundance of independently accessible memories offering high internal memory bandwidth. We describe an approach for leveraging such bandwidth in the context of synthesizing parallel software into hardware. Our approach applies trace-based profiling to determine how a program's arrays should be automatically partitioned into sub-arrays, which are then implemented in separate RAM blocks within the target FPGA. The end result is that each thread, when implemented in hardware, has exclusive access to its own memories to the extent possible, significantly reducing contention and arbitration and thus raising performance.


“…[19] used distributed caches, but did not address coherency at all. This was treated in [23], but with more complex hardware (due to combined read/write ports), and lack of separate coherence clusters and cache-to-cache transfers.…”

Section: Related Work (mentioning)

confidence: 99%

“…No explicit coherency management between them is required. Potentially dependent accesses are assigned to Cache Ports in the same Cluster, with explicit coherency mechanisms (in contrast to, e.g., [19], which provided only incoherent caches).…”

Section: Coherency Mechanisms (mentioning)

confidence: 99%

MARC II: A parametrized speculative multi-ported memory subsystem for reconfigurable computers

Lange, Wink, Koch 2011

2011 Design, Automation & Test in Europe


Abstract: We describe a parameterized memory system suitable as target for automatic high-level language to hardware compilers for reconfigurable computers. It fully supports the spatial computation paradigm by allowing the realization of each memory operator by a dedicated hardware memory port. Interport coherency is maintained only for those ports that actually require it, and efficient speculative execution is enabled by a dynamic scheme for arbitrating access to shared resources (such as main memory), relying on techniques inspired by the branch prediction of conventional software-programmable processors.
