2017
DOI: 10.1145/3053679
|View full text |Cite
|
Sign up to set email alerts
|

Microarchitectural Comparison of the MXP and Octavo Soft-Processor FPGA Overlays

Abstract: Field-Programmable Gate Arrays (FPGAs) can yield higher performance and lower power than software solutions on CPUs or GPUs. However, designing with FPGAs requires specialized hardware design skills and hours-long CAD processing times. To reduce and accelerate the design effort, we can implement an overlay architecture on the FPGA, on which we then more easily construct the desired system but at a large cost in performance and area relative to a direct FPGA implementation. In this work, we compare the micro-ar… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
7
0

Year Published

2018
2018
2022
2022

Publication Types

Select...
6
2

Relationship

0
8

Authors

Journals

citations
Cited by 9 publications
(7 citation statements)
references
References 26 publications
0
7
0
Order By: Relevance
“…Most successful TM overlays are based on soft processors. The more performance oriented ones include, SIMD Octavo [13], VectorBlox MXP [24] and VLIW TILT [19]. A massively parallel overlay, called GRVI Phalanx [7], based on the RISC-V processor and the Hoplite NOC [11] mapped 1680 RISC-V cores onto an UltraScale+ VU9P.…”
Section: Related Workmentioning
confidence: 99%
See 1 more Smart Citation
“…Most successful TM overlays are based on soft processors. The more performance oriented ones include, SIMD Octavo [13], VectorBlox MXP [24] and VLIW TILT [19]. A massively parallel overlay, called GRVI Phalanx [7], based on the RISC-V processor and the Hoplite NOC [11] mapped 1680 RISC-V cores onto an UltraScale+ VU9P.…”
Section: Related Workmentioning
confidence: 99%
“…NOPs (equal to IWP-1) must be added between dependant instructions (DFG nodes) unless other non-dependant instructions can be scheduled in between. For example, in the first (top) cluster, Node 17 is scheduled, followed by 13,25,9,20, and 12, before 15 is scheduled. Hence, the dependency between 17 and 15 is resolved and no NOPs are inserted.…”
Section: Compiling To the Overlaymentioning
confidence: 99%
“…To improve power consumption and throughput, smaller and faster processor architectures, such as the iDEA processor [22], have been proposed. Examples of multi-threaded and parallel processors include: CUSTARD [23], Octavo [24] and SIMD-Octavo [25], The VectorBlox MXP soft vector processor [26] and the TILT VLIW processor [27].…”
Section: B Time-multiplexed Overlaysmentioning
confidence: 99%
“…FPGA overlay architectures [13], [14], [15], [16], [17] built around runtime programmable hardware blocks have emerged as one possible solution to this challenge, offering improved design productivity, by virtue of fast compilation, software-like programmability and run-time management, and high-level design abstraction. Runtime programmable hardware blocks may include (soft) processor arrays [18], [19], DMA engines, SIMD/VLIW engines [20], [21], programmable data-flow engines [22], [23], [24], [25], or Network-on-Chip (NoC) nodes [26].…”
Section: Introductionmentioning
confidence: 99%