Optimizing Affine Control With Semantic Factorizations

Alias, Christophe; Plesco, Alexandru

doi:10.1145/3162017

Cited by 2 publications

(3 citation statements)

References 38 publications

“…With the PPN partitioning strategy, it is possible to factor the channels [43] and the processes which share a lot of common control. We also proposed a back-end algorithm to compact affine control [6]. Section 6 will outline our back-end.…”

Section: Compilation Methodology (mentioning)

confidence: 99%

Data-aware process networks

Alias

Plesco²

2021

Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction

Self Cite

With the emergence of reconfigurable FPGA circuits as a credible alternative to GPUs for HPC acceleration, new compilation paradigms are required to map high-level algorithmic descriptions to a circuit configuration (High-Level Synthesis, HLS). In particular, novel parallelization algorithms and intermediate representations are required. In this paper, we present the data-aware process networks (DPN), a dataflow intermediate representation suitable for HLS in the context of high-performance computing. DPN combines the benefits of a low-level dataflow representation (close to the final circuit) and affine iteration space tiling to explore the parallelization trade-offs (local memory size, communication volume, parallelization degree). We outline our compilation algorithms to map a C program to a DPN (front-end), then to map a DPN to an FPGA configuration (back-end). Finally, we present synthesis results on compute-intensive kernels from the Polybench suite. CCS Concepts: • Hardware → High-level and register-transfer level synthesis; • Theory of computation → Streaming models.

“…The Inmux is responsible of fetching data from the Buffers and Demuxes to store the data into output Buffers. All these units make an extensive use of piece-wise affine functions, which are efficiently synthesized as DAGs, using the technique described in [6]. These units communicate over a long pipelined bus with channels.…”

Section: Back-end (mentioning)

confidence: 99%

Data-aware process networks

Alias

Plesco²

2021

Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction

Self Cite

“…The bigger is a parallel unit, the less it can be duplicated, thereby limiting the overall performance. Particularly, tricky program optimizations are likely to spoil the performances if the circuit is not post-optimized carefully [5]. An important consequence is that the roofline model is no longer valid in HLS [8].…”

Section: Introduction (mentioning)

confidence: 99%

Improving Communication Patterns in Polyhedral Process Networks

Alias¹

2018

Preprint

Self Cite

Embedded system performances are bounded by power consumption. The trend is to offload greedy computations on hardware accelerators such as GPU, Xeon Phi or FPGA. FPGA chips combine both flexibility of programmable chips and energy-efficiency of specialized hardware and appear as a natural solution. Hardware compilers from high-level languages (High-level synthesis, HLS) are required to exploit all the capabilities of FPGA while satisfying tight time-to-market constraints. Compiler optimizations for parallelism and data locality restructure deeply the execution order of the processes, hence the read/write patterns in communication channels. This breaks most FIFO channels, which have to be implemented with addressable buffers. Expensive hardware is required to enforce synchronizations, which often results in dramatic performance loss. In this paper, we present an algorithm to partition the communications so that most FIFO channels can be recovered after a loop tiling, a key optimization for parallelism and data locality. Experimental results show a drastic improvement of FIFO detection for regular kernels at the cost of a little additional storage. As a bonus, the storage can even be reduced in some cases.
