Memory Virtualization for Multithreaded Reconfigurable Hardware

A key enabler for the ever-increasing adoption of FPGA accelerators is the availability of frameworks allowing for the seamless coupling to general-purpose host processors. Embedded FPGA+CPU systems still heavily rely on copy-based host-to-accelerator communication, which complicates application development. In this paper, we present a hardware/software framework for enabling transparent, shared virtual memory for FPGA accelerators in embedded SoCs. It can use a hard-macro IOMMU if available, or a configurable soft-core IOMMU that we provide. We explore different TLB configurations and provide a comparison with other designs for shared virtual memory to gain insight into performance-critical IOMMU components. Experimental results using pointer-rich benchmarks show that our framework not only simplifies FPGA-accelerated application development, but also achieves up to 13x speedup compared to traditional copy-based offloading.

“…Another approach relies on operating system support to enable SVM using per-thread hardware IOMMUs [28].…”

Section: Related Work (mentioning)

confidence: 99%

Exploring Shared Virtual Memory for FPGA Accelerators with a Configurable IOMMU

Vogel, Marongiu, Benini. 2019. IEEE Trans. Comput.

“…The MMU includes a translation lookaside buffer (TLB), which autonomously translates addresses using the Linux kernel's page tables [APL11].…”

Section: System-on-chip Architecture (mentioning)

confidence: 99%

FPGAs for Software Programmers

Koch, Hannig, Ziener. 2016.

“…In ReconOS terms, a Tinuso-I core could be considered a hardware thread. ReconOS has been extended with transparent address translation in the ReconOS VM system [21].…”

Section: Related Work (mentioning)

confidence: 99%