2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) 2021
DOI: 10.1109/fccm51124.2021.00032

Extending High-Level Synthesis for Task-Parallel Programs

Abstract: C/C++/OpenCL-based high-level synthesis (HLS) becomes more and more popular for field-programmable gate array (FPGA) accelerators in many application domains in recent years, thanks to its competitive quality of result (QoR) and short development cycle compared with the traditional register-transfer level (RTL) design approach. Yet, limited by the sequential C semantics, it remains challenging to adopt the same highly productive high-level programming approach in many other application domains, where coarse-grai…

Cited by 19 publications (3 citation statements)
References 63 publications

“…The 'max depth' of the arrays listed in Table 1 was assigned the maximum value required among the benchmarks: V = 2,400, C = 630K, UCB size = 32K, O = 120K, and K = 64. FYalSAT has been programmed in C++, and it was synthesized with TAPA [33] and AMD/Xilinx's Vitis HLS 2022.2 [19]. The generated FPGA bitstream was tested on the Alveo U250 platform [34].…”
Section: Evaluation A. Experimental Setup
Citation type: mentioning
confidence: 99%
“…A holistic Task Scheduling solution is presented in [42], where a HW task scheduler with the ability to drive CPUs, GPUs, and FPGAs is described. Other approaches use Task Scheduling program representations to automatically synthesize equivalent hardware [43] or configure dataflow systems [44]. These solutions offer substantial energy and latency advantages over ordinary CPU or GPU execution, but lack the versatility that these baselines or our proposal offer.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Our Dataflow cache is actually an example of cyclic dataflow graph. Fine Licht et al. [16] and Chi et al. [17] … The tasks communicate and synchronize through FIFO queues. The request FIFO, which flows from Master to Slave, contains the inputs to the Slave operation (e.g., if the operation is a read access from an off-chip memory, it contains the address to be read).…”
Section: A Cyclic Dataflow Protocol
Citation type: mentioning
confidence: 99%