“…Custom clusters are based on the concept of systolic array model in parallel computing architecture, where every node acts as a data processing unit and processed data move from one node to another through first-in first-out (FIFO) buffer or network semantics. Some of these architectures [115][116][117][118] use Peer to Peer (P2P) connection MaxRing, fast series transceivers with FIFO buffers, and Peripheral Component Interconnect Express (PCIe) links, for transmitting data across multiple nodes. Tailored designs allow the direct communication among the nodes through explicit network connections.…”
Modern datacenters are reinforcing the computational power and energy efficiency by assimilating field programmable gate arrays (FPGAs). The sustainability of this large-scale integration depends on enabling multi-tenant FPGAs. This requisite amplifies the importance of communication architecture and virtualization method with the required features in order to meet the high-end objective. Consequently, in the last decade, academia and industry proposed several virtualization techniques and hardware architectures for addressing resource management, scheduling, adoptability, segregation, scalability, performance-overhead, availability, programmability, time-to-market, security, and mainly, multitenancy. This paper provides an extensive survey covering three important aspects—discussion on non-standard terms used in existing literature, network-on-chip evaluation choices as a mean to explore the communication architecture, and virtualization methods under latest classification. The purpose is to emphasize the importance of choosing appropriate communication architecture, virtualization technique and standard language to evolve the multi-tenant FPGAs in datacenters. None of the previous surveys encapsulated these aspects in one writing. Open problems are indicated for scientific community as well.
“…Custom clusters are based on the concept of systolic array model in parallel computing architecture, where every node acts as a data processing unit and processed data move from one node to another through first-in first-out (FIFO) buffer or network semantics. Some of these architectures [115][116][117][118] use Peer to Peer (P2P) connection MaxRing, fast series transceivers with FIFO buffers, and Peripheral Component Interconnect Express (PCIe) links, for transmitting data across multiple nodes. Tailored designs allow the direct communication among the nodes through explicit network connections.…”
Modern datacenters are reinforcing the computational power and energy efficiency by assimilating field programmable gate arrays (FPGAs). The sustainability of this large-scale integration depends on enabling multi-tenant FPGAs. This requisite amplifies the importance of communication architecture and virtualization method with the required features in order to meet the high-end objective. Consequently, in the last decade, academia and industry proposed several virtualization techniques and hardware architectures for addressing resource management, scheduling, adoptability, segregation, scalability, performance-overhead, availability, programmability, time-to-market, security, and mainly, multitenancy. This paper provides an extensive survey covering three important aspects—discussion on non-standard terms used in existing literature, network-on-chip evaluation choices as a mean to explore the communication architecture, and virtualization methods under latest classification. The purpose is to emphasize the importance of choosing appropriate communication architecture, virtualization technique and standard language to evolve the multi-tenant FPGAs in datacenters. None of the previous surveys encapsulated these aspects in one writing. Open problems are indicated for scientific community as well.
“…e accelerators must be accessed through an underlying infrastructure that interfacing host and FPGA accelerators. ere are currently a few numbers of academic [5,[8][9][10][11] and industrial [6, 7] frameworks that are designed to connect a host to the FPGA accelerators. However, they either do not support or fail to provide a seamless interface to multiple accelerators accessed simultaneously by various applications.…”
Section: Background and Motivationmentioning
confidence: 99%
“…rough a static accelerator allocation [5][6][7][8][9][10] exploiting parallelism is not practically possible, since applications are not aware of the status of all the accelerators on FPGAs.…”
Section: Background and Motivationmentioning
confidence: 99%
“…For every single request from the host, an accelerator on the FPGA is allocated to the request until the result is sent back to the host. To the best of our knowledge, all the current FPGA accelerator frameworks [5][6][7][8][9][10] follow a static accelerator allocation scheme; it means that so ware developers have to exactly specify the target accelerators for any access request in the so ware code. is can lead to a poor utilization of accelerators when being shared among various applications.…”
Despite all the available commercial and open-source frameworks to ease deploying FPGAs in accelerating applications, the current schemes fail to support sharing multiple accelerators among various applications. ere are three main features that an accelerator sharing scheme requires to support: exploiting dynamic parallelism of multiple accelerators for a single application, sharing accelerators among multiple applications, and providing a non-blocking congestion-free environment for applications to invoke the accelerators. In this paper, we developed a scalable fully functional hardware controller, called UltraShare, with a supporting so ware stack that provides a dynamic accelerator sharing scheme through an accelerators grouping mechanism. UltraShare allows so ware applications to fully utilize FPGA accelerators in a non-blocking congestion-free environment. Our experimental results for a simple scenario of a combination of three streaming accelerators invocation show an improvement of up to 8x in throughput of the accelerators by removing accelerators idle times.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.