The SARC architecture is composed of multiple processor types and a set of user-managed direct memory access (DMA) engines that let the runtime scheduler overlap data transfer and computation. The runtime system automatically allocates tasks on the heterogeneous cores and schedules the data transfers through the DMA engines. SARC's programming model supports various highly parallel applications, with matching support from specialized accelerator processors. On-chip parallel computation shows great promise for scaling raw processing performance within a given power budget. However, chip multiprocessors (CMPs) often struggle with programmability and scalability issues such as cache coherency and off-chip memory bandwidth and latency.
The Inferior-Olivary nucleus (ION) is a well-charted region of the brain, heavily associated with sensorimotor control of the body. It comprises ION cells with unique properties which facilitate sensory processing and motor-learning skills. Various simulation models of ION-cell networks have been written in an attempt to unravel their mysteries. However, simulations become rapidly intractable when biophysically plausible models and meaningful network sizes (≥100 cells) are modeled. To overcome this problem, in this work we port a highly detailed ION cell network model, originally coded in Matlab, onto an FPGA chip. It was first converted to ANSI C code and extensively profiled. It was, then, translated to HLS C code for the Xilinx Vivado toolflow and various algorithmic and arithmetic optimizations were applied. The design was implemented in a Virtex 7 (XC7VX485T) device and can simulate a 96-cell network at real-time speed, yielding a speedup of ×700 compared to the original Matlab code and ×12.5 compared to the reference C implementation running on a Intel Xeon 2.66GHz machine with 20GB RAM. For a 1,056-cell network (non-real-time), an FPGA speedup of ×45 against the C code can be achieved, demonstrating the design's usefulness in accelerating neuroscience research. Limited by the available on-chip memory, the FPGA can maximally support a 14,400-cell network (non-real-time) with online parameter configurability for cell state and network size. The maximum throughput of the FPGA IONnetwork accelerator can reach 2.13 GFLOPS.
Abstract-Sequence alignment is a fundamental instrument in Bioinformatics. In recent years, numerous proposals have been addressing the problem of accelerating this class of applications. This, due to the rapid growth of sequence databases in combination with the high computational demands imposed by the algorithms. In this paper we focus on the analysis of the progressive alignment in ClustalW, a widely used program for performing multiple sequence alignment. We have parallelized ClustalW for the Cell processor architecture and have carefully analyzed the scalability of its different phases with both the number of cores used and the input size. Experimental results show that computing profile scores scales well up to 16 SPE cores. With the increase of the input size, profiles initialization in the PPE core becomes the predominant bottleneck.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.