Real-time parallel implementation of Pulse-Doppler radar signal processing chain on a massively parallel machine based on multi-core DSP and Serial RapidIO interconnect

Pulse-Doppler radars require high computing power. This paper develops a massively parallel machine that implements a Pulse-Doppler radar signal processing chain in real time. The proposed machine consists of two C6678 digital signal processors (DSPs), each with eight DSP cores, interconnected by a Serial RapidIO (SRIO) bus. Each individual core is treated as the basic processing element, so the machine contains 16 processing elements. A straightforward model is adopted to distribute the Pulse-Doppler radar signal processing chain across them. This model provides low latency, but communication inefficiency limits system performance. The paper proposes several optimizations that greatly reduce inter-processor communication in the straightforward model and improve the parallel efficiency of the system. A use case of the Pulse-Doppler radar signal processing chain illustrates and validates the proposed mapping model. Experimental results show that the parallel efficiency of the proposed machine is about 90%.
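
To put the reported efficiency in perspective, the short sketch below converts it into an implied speedup. It assumes the common textbook definition of parallel efficiency (speedup divided by the number of processing elements), which the abstract does not state explicitly; the function names are hypothetical and chosen only for illustration.

```python
def parallel_efficiency(t_serial, t_parallel, n_elements):
    """Efficiency = speedup / number of processing elements (assumed definition)."""
    speedup = t_serial / t_parallel
    return speedup / n_elements


def implied_speedup(efficiency, n_elements):
    """Invert the definition above: speedup = efficiency * number of elements."""
    return efficiency * n_elements


if __name__ == "__main__":
    # 16 processing elements (2 DSPs x 8 cores) at about 90% efficiency.
    print(implied_speedup(0.90, 16))
```

Under that assumption, 16 processing elements at about 90% efficiency correspond to a speedup of roughly 14.4 over a single core.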


Instruction scheduling heuristic for an efficient FFT in VLIW processors with balanced resource usage

Bahtat, Belkouch, Elleaume, et al. 2016. EURASIP J. Adv. Signal Process.


The fast Fourier transform (FFT) is perhaps today's most ubiquitous algorithm used with digital data; hence, it is still being studied extensively. Beyond reducing the arithmetic count of the FFT algorithm, memory references and the mapping of the scheme onto the processor's architecture are critical for a fast and efficient implementation. The main bottlenecks include the long-latency memory accesses to the butterfly legs and the redundant references to twiddle factors. In this paper, we describe a new FFT implementation on high-end very long instruction word (VLIW) digital signal processors (DSPs), which improves performance in terms of clock cycles thanks to the resulting low-level resource balance and to the reduced memory accesses to twiddle factors. The method introduces a tradeoff parameter between accuracy and speed. Additionally, we suggest a cache-efficient implementation methodology for the FFT that depends on the available VLIW hardware resources and cache structure. Experimental results on a TI VLIW DSP show that our method reduces the number of clock cycles by an average of 51% (about a 2× speedup) compared with the most assembly-optimized and vendor-tuned FFT libraries. The FFT was generated using an instruction-level scheduling heuristic: a modulo-based, register-sensitive scheduling algorithm that computes an aggressively efficient sequence of VLIW instructions for the FFT, maximizing the parallelism rate while minimizing clock cycles and register usage.
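
As a rough illustration of why twiddle-factor references matter, the sketch below is a plain iterative radix-2 FFT that computes a single twiddle table once and indexes into it at every stage instead of recomputing the factors. This is a generic textbook technique, not the scheduling heuristic or the VLIW implementation described in the abstract, and the function name is hypothetical.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    # Twiddle table computed once: twiddles[k] = exp(-2*pi*i*k/n) for k < n/2.
    twiddles = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

    # Reorder the input by bit-reversed index.
    bits = n.bit_length() - 1
    a = [0j] * n
    for i in range(n):
        rev = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        a[rev] = complex(x[i])

    # Butterfly stages: every stage reuses the same table with a stride.
    size = 2
    while size <= n:
        half = size // 2
        stride = n // size
        for start in range(0, n, size):
            for j in range(half):
                w = twiddles[j * stride]
                u = a[start + j]
                v = a[start + j + half] * w
                a[start + j] = u + v
                a[start + j + half] = u - v
        size *= 2
    return a


if __name__ == "__main__":
    print(fft_radix2([1, 0, 0, 0]))  # a unit impulse has a flat spectrum
```

The table holds only n/2 entries, and each stage reads it with a stride of n/size, so no twiddle factor is computed more than once per transform.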


Performance optimization of high-speed Interconnect Serial RapidIO for onboard processing

Klilou, Belkouch, Elleaume, et al. 2012


Efficient implementation scheme of a real-time radar beamformer on a VLIW DSP processor, TMS320C66x TI DSP implementation

Bahtat, Belkouch, Elleaume, et al. 2012


Case studies of data traffic management on a high-performance computing system based on multi-DSPs and Serial RapidIO interconnect

Klilou, Belkouch, Elmaizi, et al. 2016



Philippe Elleaume

Real-time parallel implementation of Pulse-Doppler radar signal processing chain on a massively parallel machine based on multi-core DSP and Serial RapidIO interconnect
