Recently, digital signal processors featuring vector SIMD instructions have gained renewed attention, since they offer the potential to speed up the computation of digital signal processing algorithms. However, when recursive algorithms are implemented, the maximum achievable speedup factors are bounded above. In this paper we investigate these performance limitations when purely recursive filters are implemented on parallel DSP architectures. We show that, by applying algebraic transformations, a block formulation of any recursive filter can be derived that can be implemented efficiently on SIMD DSP architectures. We also show that the number of additional vector operations introduced by the transformation grows linearly with the level of parallelism and does not depend on the recursion order. These results make significant speedup factors achievable even for low-order recursions. Moreover, we introduce a suitable algebraic notation for the block formulation of the recursive filter, which reveals the processor instructions required to implement the algorithm on the SIMD DSP.
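As an illustration of what a block formulation looks like, the following is a minimal sketch for the simplest case, a first-order recursion y[n] = a*y[n-1] + x[n]. It is not the transformation derived in the paper: the function names, the block size parameter `P`, and the restriction to first order are assumptions made for this example. Unrolling the recursion over a block of P samples gives y[n+k] = a^(k+1)*y[n-1] + sum_{j=0..k} a^(k-j)*x[n+j], so all P outputs of a block depend only on the block's inputs and on one previous output, and can be computed lane by lane.

```python
def recursive_scalar(a, x, y0=0.0):
    """Reference: sample-by-sample recursion y[n] = a*y[n-1] + x[n]."""
    y, prev = [], y0
    for xn in x:
        prev = a * prev + xn
        y.append(prev)
    return y


def recursive_block(a, x, P, y0=0.0):
    """Block form of the same recursion, P outputs per iteration.

    Each list comprehension below stands in for an operation over P
    independent lanes; only `prev` is carried between blocks.
    """
    if len(x) % P != 0:
        raise ValueError("len(x) must be a multiple of the block size P")
    powers = [a ** k for k in range(P + 1)]  # a^0 .. a^P, precomputed once
    y, prev = [], y0
    for start in range(0, len(x), P):
        xb = x[start:start + P]
        # Feed-forward part: lower-triangular Toeplitz matrix times the block.
        ff = [sum(powers[k - j] * xb[j] for j in range(k + 1)) for k in range(P)]
        # Recursive part: previous output scaled by [a, a^2, ..., a^P].
        yb = [powers[k + 1] * prev + ff[k] for k in range(P)]
        y.extend(yb)
        prev = yb[-1]  # the only value fed back into the next block
    return y
```

The block version produces the same output as the scalar loop, but the loop-carried dependency is crossed once per block rather than once per sample, which is what makes the inner work amenable to SIMD lanes.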
Abstract: SIMD processors have made their way from supercomputer architectures to embedded real-time signal processing. This trend has been driven by signal processing applications with heavy number-crunching requirements, such as baseband processing on mobile devices. Depending on the data dependencies of the algorithms and on implementation constraints such as real-time operation, power consumption and die size, the necessary SIMD parallelism can be put into a piece of silicon for a given application. This poses two challenges. On the one hand, the DSP core design has to be streamlined so that changes to the architecture can be prototyped very quickly. On the other hand, algorithm design and development have to be carried out independently of the level of SIMD parallelism available on the DSP in order to enable software reusability. In this paper we report our HW/SW methodology for designing DSP cores and algorithms that exploit SIMD parallelism. On the hardware development side, taking as a starting point a novel hardware architectural template called STA, we explain how our approach automatically generates simulation and hardware models of DSP cores with a scalable level of SIMD parallelism. On the software development side, based on an algebraic model that captures the SIMD computational model, we explain how algorithms can be designed independently of the available SIMD parallelism. We also report how this algebraic model can be easily expressed in Matlab syntax. This enables automatic code generation from Matlab programs for our family of DSP cores.
Abstract: Taking as a starting point a collection of algebraic primitives that captures the SIMD computational model, we show in this paper our methodology for designing, mapping and implementing algorithms for SIMD vector signal processors with a scalable level of parallelism. Taking the LMS algorithm as an example, we show how an algorithm that has been designed to exhibit a suitable level of data parallelism can be described by these algebraic primitives. In turn, these algebraic primitives are programmed in a matrix-oriented language. A suitable compiler generates object code for SIMD processors with a scalable number of processing elements.
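To make the idea of describing LMS through a few vector primitives concrete, here is a minimal sketch of the standard LMS update written in terms of two such operations. The primitives `vdot` and `vaxpy`, the function `lms`, and its parameters are hypothetical names chosen for this example; they are not the primitive set or the language used in the paper.

```python
def vdot(a, b):
    """Elementwise multiply followed by a sum reduction."""
    return sum(ai * bi for ai, bi in zip(a, b))


def vaxpy(alpha, x, y):
    """Scalar-times-vector plus vector: alpha*x + y, lane by lane."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def lms(x, d, num_taps, mu):
    """Standard LMS adaptive filter expressed with the two primitives above.

    x: input samples, d: desired samples, mu: step size.
    Returns the final weights and the per-sample error sequence.
    """
    w = [0.0] * num_taps  # filter weights
    u = [0.0] * num_taps  # delay line, newest sample first
    errors = []
    for xn, dn in zip(x, d):
        u = [xn] + u[:-1]        # shift the new sample into the delay line
        e = dn - vdot(w, u)      # filter output and error
        w = vaxpy(mu * e, u, w)  # weight update: w + mu*e*u
        errors.append(e)
    return w, errors
```

Because neither primitive fixes the vector length, the same description applies whatever number of lanes the target provides, which is the property the abstract attributes to its algebraic formulation.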