e edge computing paradigm has emerged to handle cloud computing issues such as scalability, security and high response time among others. is new computing trend heavily relies on ubiquitous embedded systems on the edge. Performance and energy consumption are two main factors that should be considered during the design of such systems. Focusing on performance and energy consumption, this paper studies the opportunities and challenges that a heterogeneous embedded system consisting of embedded FPGAs and GPUs (as accelerators) can provide for applications. We study three design, modeling and scheduling challenges throughout the paper. We also propose three techniques to cope with these three challenges. Applying the proposed techniques to three applications including image histogram, dense matrix-vector multiplication and sparse matrix-vector multiplications show 1.79x and 2.29x improvements in performance and energy consumption, respectively, when both FPGA and GPU execute the corresponding application in parallel.
Multithreading programming can improve performance of an application especially to reduce processor busy waiting. Typically, threads that have to wait for input/output responses can wait in a queue (sleep queue), allowing other threads to utilize processor, therefore improving system timeliness and throughput. As such an application can be partitioned into several threads that can be executed on either single or multiple processors. Sharing of processors among threads however requires scheduling to ensure fair sharing scheme or to meet a specific execution objective. The scheduling mechanism serves to allocate which threads get to run on a processor alternately according to the adopted sharing scheme.Processor can be relieved of executing the required scheduling task if it can be performed by a hardware entity such as Field Programmable Gate Array (FPGA). This paper describes initial design of hardware scheduler and modification of thread manager to support the migration (of thread scheduler into the hardware). The scheduler is designed as an Intellectual Property (IP) core that can be instantiated like any peripheral core. The work is intended to enable multithreading on XILINX microkernel with a hardware thread scheduler instead of Von Neumann stored instruction scheduling execution.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.