As FPGA device capacity grows in step with Moore's law, it becomes possible to put a complete manycore system onto a single FPGA chip. The centralized memory hierarchy of typical embedded systems, in which both data and instructions are stored in off-chip global memory, introduces bus contention as the number of processing cores increases. In this work, we explore how distributed multi-tiered memory hierarchies affect the scalability of manycore systems. We use Xilinx Virtex FPGA devices as the test platforms and buses as the interconnect. Several variants of the centralized and distributed memory hierarchies are compared by running benchmarks including matrix multiplication, IDEA encryption, and 3D FFT. The results demonstrate the good scalability of the distributed memory hierarchy for systems of up to 32 MicroBlaze processors, a limit imposed by the FPGA resources on the Virtex-6 LX240T device.
FPGA densities have continued to follow Moore's law and can now support a complete multiprocessor system on a programmable chip. The benefits of the FPGA include the ability to build a customized MPSoC system consisting of heterogeneous processing resources, interconnects, and memory hierarchies that best match the requirements of each application. In this paper we outline a new approach that allows users to drive the generation of a complete hardware/software co-designed multiprocessor system on a programmable chip from an unaltered, standard high-level programming model. We use OpenCL as our specification framework and show how key APIs are extracted and used to automatically create a distributed shared memory multiprocessor system-on-chip architecture for Xilinx FPGAs. We show how OpenCL APIs are easily translated to hthreads, a hardware-based microkernel operating system, to provide pthreads-compliant run-time services within the MPSoPC architecture.
Modern platform FPGAs exceed the million-LUT level and are large enough to support complete heterogeneous Multiprocessor Systems-on-Chip (MPSoCs). Constructing systems with tens of processors is currently feasible using existing manual methods within vendor-specific CAD tools. However, these manual approaches will not be feasible for constructing future systems with hundreds to thousands of processors. Instead, new automated system assembly approaches will be required to handle these levels of system complexity and diversity. In this paper we present a new automated design flow for creating such next-generation heterogeneous MPSoCs. An integral part of the generated MPSoPC system is a general-purpose PThreads-compliant HW/SW co-designed operating system and heterogeneous compiler. Our design flow has been placed in the cloud and is freely accessible across the Internet.