In this demo, we present fpga-ToPSS (Toronto Publish/Subscribe System Family), an efficient event processing platform for highfrequency and low-latency algorithmic trading. Our event processing platform is built over reconfigurable hardware-FPGAs-to achieve line-rate processing. Furthermore, our event processing engine supports Boolean expression matching with an expressive predicate language that models complex financial strategies to autonomously buy and sell stocks based on real-time financial data.
As FPGA-based systems including soft-processors become increasingly common we are motivated to better understand the best way to scale the performance of such systems. In this paper we explore the organization of processors and caches connected to a single off-chip memory channel, for workloads composed of many independent threads. In particular we design and evaluate real FPGA-based processor, multithreaded processor, and multiprocessor systems on EEMBC benchmarks-investigating different approaches to scaling caches, processors, and thread contexts to maximize throughput while minimizing area. Our main finding is that while a single multithreaded processor offers improved performance over a single-threaded processor, multiprocessors composed of single-threaded processors scale better than those composed of multithreaded processors.
As reconfigurable computing hardware and in particular FPGA-based systems-on-chip comprise an increasing number of processor and accelerator cores, supporting sharing and synchronization in a way that is scalable and easy to program becomes a challenge.
Transactional Memory
(TM) is a potential solution to this problem, and an FPGA-based system provides the opportunity to support TM in hardware (HTM). Although there are many proposed approaches to HTM support for ASICs, these do not necessarily map well to FPGAs. In particular in this work we demonstrate that while
signature
-based conflict detection schemes (essentially bit-vectors) should intuitively be a good match to the bit parallelism of FPGAs, previous approaches result in unacceptable multicycle stalls, operating frequencies, or false-conflict rates. Capitalizing on the reconfigurable nature of FPGA-based systems, we propose an application-specific signature mechanism for HTM conflict detection. Our evaluation uses real and projected FPGA-based soft multiprocessor systems that support HTM and implement threaded, shared-memory network packet processing applications. We find that our application-specific approach: (i) maintains a reasonable operating frequency of 125 MHz, (ii) achieves a 9% to 71% increase in packet throughput relative to signatures with bit selection on a 2-thread architecture, and (iii) allows our HTM to achieve 6%, 54%, and 57% increases in packet throughput on an 8-thread architecture versus a baseline lock-based synchronization for three of four packet processing applications studied, due to reduced false synchronization.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.