High-bandwidth connectivity provided by WDM optical interconnects is an important enabler for delocalized hardware accelerators in utility computing. We validate the proposed architecture with an experiment that leverages optical interconnects to demonstrate error-free (BER < 10^-12) active switching and multicasting of packets generated and received by FPGAs.
Introduction

With the recent growth in cloud computing, the concept of utility computing (an architectural model in which hardware resources are offered as on-demand services) has emerged as a new paradigm. Not only can these cloud computing systems parallelize the workload of an application across multiple processors, they can also offer specialized hardware to off-load and accelerate program execution [1]. Known as hardware accelerators, these compute nodes perform specific calculations faster than general-purpose processors. Computation kernels that are common in cloud computing algorithms, such as pattern matching and digital signal processing, can benefit greatly from hardware acceleration. Graphics Processing Units (GPUs) and other forms of hardware acceleration are already used commercially in the finance industry, where they can deliver up to a 24x increase in performance, lower latency, and lower power consumption than systems without hardware acceleration [1].

Communication between the Central Processing Unit (CPU) and these hardware accelerators must be high-bandwidth, low-latency, and energy-efficient [2]. Because of these demands, and the power limitations of high-speed electronic communication over long distances, accelerators must be placed physically close to the CPU (localized on the motherboard). This architectural limitation restricts each CPU to the small number of accelerators local to it, and can leave those accelerators underutilized, since no other CPU can reach them [2]. Delocalizing the accelerators into an architecture with a central bank of accelerators would allow them to be dynamically allocated to different tasks. Current electronic networks, however, cannot support such a system; optical interconnects offer the bandwidth, latency, and energy specifications necessary to support delocalized accelerators [3]. We propose a novel system architecture with an optically connected bank of hardware accelerators. This specialized hardware would offer applications always-on, always-accessible acceleration for their specific computational tasks.

While this architecture would require additional hardware to manage the accelerator network, this hardware is minimal and does not introduce significant complexity to the design. Programming hardware accelerators is an additional hurdle, but with the development of languages such as the Open Computing Language (OpenCL), which allows the same program to run across CPUs and GPUs, and IBM's Liquid Metal, a comprehensive compiler and run-time syst...
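To illustrate the portability point made about OpenCL above (this sketch is not part of the proposed system), the minimal host program below builds the same kernel source for whichever device type is requested; swapping CL_DEVICE_TYPE_GPU for CL_DEVICE_TYPE_CPU retargets the computation without changing the kernel. The kernel name and buffer contents are hypothetical, and error handling is omitted for brevity.

    #include <stdio.h>
    #include <CL/cl.h>

    /* The same kernel source runs on a CPU or a GPU device. */
    static const char *kernel_src =
        "__kernel void scale(__global float *x, const float a) {"
        "    size_t i = get_global_id(0);"
        "    x[i] = a * x[i];"
        "}";

    int main(void) {
        cl_platform_id platform;
        clGetPlatformIDs(1, &platform, NULL);

        /* Change CL_DEVICE_TYPE_GPU to CL_DEVICE_TYPE_CPU to retarget;
           the host code and kernel source stay the same. */
        cl_device_id device;
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

        cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
        cl_command_queue q = clCreateCommandQueue(ctx, device, 0, NULL);

        cl_program prog = clCreateProgramWithSource(ctx, 1, &kernel_src, NULL, NULL);
        clBuildProgram(prog, 1, &device, NULL, NULL, NULL);
        cl_kernel k = clCreateKernel(prog, "scale", NULL);

        float data[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                    sizeof(data), data, NULL);
        float a = 2.0f;
        clSetKernelArg(k, 0, sizeof(cl_mem), &buf);
        clSetKernelArg(k, 1, sizeof(float), &a);

        size_t global = 4;
        clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
        clEnqueueReadBuffer(q, buf, CL_TRUE, 0, sizeof(data), data, 0, NULL, NULL);

        printf("%f %f %f %f\n", data[0], data[1], data[2], data[3]);

        clReleaseMemObject(buf);
        clReleaseKernel(k);
        clReleaseProgram(prog);
        clReleaseCommandQueue(q);
        clReleaseContext(ctx);
        return 0;
    }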