“…Over the past 15 years, many CGRA processors with different architectures and execution modes have been proposed [4,5,11,13,16,17,29,31,39,43,46]. In this work, we focus on CGRAs that execute modulo-scheduled loop kernels and operate in dataflow mode [5,31,46]. The dataflow graph (DFG) of a loop kernel is mapped onto such CGRAs in the form of a modulo schedule, a variant of a software-pipelined loop.…”
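To make the notion of a modulo schedule concrete, the following is a minimal sketch and not the mapping algorithm of any of the cited works. It assumes a simplified machine model with `num_pes` identical processing elements and ignores routing and interconnect constraints, which a real CGRA mapper must also handle. The function name `modulo_schedule`, its parameters, and the example graph are all hypothetical. Each DFG edge carries an iteration distance (0 for a dependence within one iteration, 1 for a value consumed in the next iteration). The sketch searches for the smallest initiation interval (II) at which every operation can be assigned a start cycle such that dependences hold and no slot `cycle % II` uses more than `num_pes` elements.

```python
from math import ceil


def modulo_schedule(latency, edges, num_pes, max_ii=64):
    """Search for a small initiation interval (II) and a start cycle per op.

    latency : dict mapping op name -> latency in cycles
    edges   : list of (src, dst, distance); distance is the number of
              loop iterations the dependence spans
    num_pes : number of identical processing elements (simplifying assumption)
    Returns (ii, start_cycles), or None if no schedule is found up to max_ii.
    """
    ops = list(latency)
    # Resource-constrained lower bound: each PE issues one op per cycle.
    res_mii = ceil(len(ops) / num_pes)

    for ii in range(res_mii, max_ii + 1):
        # Earliest start times via longest-path relaxation. A dependence
        # requires start[dst] >= start[src] + latency[src] - ii * distance.
        start = {op: 0 for op in ops}
        for _ in range(len(ops) + 1):
            changed = False
            for src, dst, dist in edges:
                lower = start[src] + latency[src] - ii * dist
                if lower > start[dst]:
                    start[dst] = lower
                    changed = True
            if not changed:
                break
        else:
            # Still changing: a recurrence cycle is too long for this II.
            continue

        # Greedy placement into the modulo reservation table.
        usage = [0] * ii
        placed = {}
        for op in sorted(ops, key=start.get):
            t = start[op]
            for src, dst, dist in edges:
                if dst == op and src in placed:
                    t = max(t, placed[src] + latency[src] - ii * dist)
            # ii consecutive cycles cover every slot, and ii >= res_mii
            # guarantees that at least one slot still has a free PE.
            for cand in range(t, t + ii):
                if usage[cand % ii] < num_pes:
                    usage[cand % ii] += 1
                    placed[op] = cand
                    break

        # Greedy placement may break a loop-carried dependence; verify.
        if all(placed[d] >= placed[s] + latency[s] - ii * k
               for s, d, k in edges):
            return ii, placed
    return None


# Hypothetical loop body: chain a -> b -> c -> d, with c feeding a in the
# next iteration (a recurrence of total latency 3 and distance 1).
latency = {"a": 1, "b": 1, "c": 1, "d": 1}
edges = [("a", "b", 0), ("b", "c", 0), ("c", "d", 0), ("c", "a", 1)]
ii, schedule = modulo_schedule(latency, edges, num_pes=2)
print(ii, schedule)
```

In this example the resource bound alone would allow an II of 2, but the recurrence through `a`, `b`, and `c` forces a larger one. Once a schedule is found, a new loop iteration starts every `ii` cycles and operations from consecutive iterations overlap, which is what makes the result a software-pipelined loop.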