This paper describes a fast and efficient hardware-accelerated pseudoinverse computation obtained by restructuring the algorithm and inserting FPGA synthesis directives for parallelism prior to high-level synthesis (HLS). The algorithm, composed of modified Gram-Schmidt QR decomposition (MGS-QRD), triangular matrix inversion (TMI), and matrix multiplication (MM), is synthesized and implemented on a field-programmable gate array (FPGA). MGS-QRD is restructured and augmented with parallelism directives before synthesis, yielding a high-throughput MGS-QRD hardware accelerator. Modifications to the existing TMI algorithm are also proposed: redundant computational tasks are removed to speed up overall operation. Data dependencies in the MM algorithm are analyzed so that appropriate parallelism directives can be inserted, and the data flow of MM is matched to that of the MGS-QRD and TMI modules to further accelerate the pseudoinverse computation. The results show that the proposed pseudoinverse module outperforms a naïve implementation composed of existing MGS-QRD, TMI, and standard MM modules in maximum frequency (1.24× speedup), hardware resources (48% reduction in DSP usage), latency (23% reduction), and throughput (62% increase).
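For orientation, the three-stage pipeline named above (MGS-QRD, then TMI, then MM) can be sketched in software as follows. This is a minimal reference sketch in plain Python, not the paper's HLS design: it contains no parallelism directives and none of the proposed restructuring, and it assumes a real-valued input matrix with full column rank, for which the pseudoinverse is R⁻¹Qᵀ. The function names `mgs_qr`, `upper_tri_inverse`, `matmul`, and `pseudoinverse` are hypothetical and chosen only for illustration.

```python
def mgs_qr(a):
    """Modified Gram-Schmidt QR of an m x n matrix given as a list of rows.

    Assumes full column rank. Returns (qt, r), where qt is Q transposed
    (n rows of length m) and r is the n x n upper-triangular factor.
    """
    m, n = len(a), len(a[0])
    # v[j] holds column j of the input; it is updated in place as
    # components along earlier q vectors are removed.
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    qt = [[0.0] * m for _ in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = sum(x * x for x in v[j]) ** 0.5
        qt[j] = [x / r[j][j] for x in v[j]]
        # Modified Gram-Schmidt: immediately orthogonalize the remaining
        # columns against the newly computed q vector.
        for k in range(j + 1, n):
            r[j][k] = sum(qt[j][i] * v[k][i] for i in range(m))
            v[k] = [v[k][i] - r[j][k] * qt[j][i] for i in range(m)]
    return qt, r


def upper_tri_inverse(r):
    """Invert an upper-triangular matrix by column-wise back substitution."""
    n = len(r)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inv[j][j] = 1.0 / r[j][j]
        for i in range(j - 1, -1, -1):
            s = sum(r[i][k] * inv[k][j] for k in range(i + 1, j + 1))
            inv[i][j] = -s / r[i][i]
    return inv


def matmul(x, y):
    """Standard triple-loop matrix product of x (p x q) and y (q x s)."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def pseudoinverse(a):
    """Pseudoinverse of a full-column-rank matrix as R^-1 * Q^T."""
    qt, r = mgs_qr(a)
    return matmul(upper_tri_inverse(r), qt)
```

As a usage check, for a full-column-rank matrix `a`, `matmul(pseudoinverse(a), a)` should be close to the n × n identity, up to floating-point error.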