Fast-changing, increasingly complex, and diverse computing platforms pose a central problem in scientific computing: how to achieve portable, optimal performance with reasonable effort. We present SPIRAL, which addresses this problem for the performance-critical domain of linear digital signal processing (DSP) transforms. For a specified transform, SPIRAL automatically generates high-performance code that is tuned to the given platform. SPIRAL formulates the tuning as an optimization problem and exploits the domain-specific mathematical structure of transform algorithms to implement a feedback-driven optimizer. Similar to a human expert, for a specified transform, SPIRAL "intelligently" generates and explores algorithmic and implementation choices to find the best match to the computer's microarchitecture. The "intelligence" is provided by search and learning techniques that exploit the structure of the algorithm and implementation space to guide the exploration and optimization. SPIRAL generates high-performance code for a broad set of DSP transforms, including the discrete Fourier transform, other trigonometric transforms, filter transforms, and discrete wavelet transforms. Experimental results show that the code generated by SPIRAL competes with, and sometimes outperforms, the best available human-tuned transform library code.
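The feedback-driven tuning idea can be illustrated with a toy sketch. This is not SPIRAL's implementation: the function names (`dft_ct`, `search`, `time_plan`), the dictionary-based "plan", and the use of wall-clock timing in pure Python are all assumptions made for illustration. The sketch enumerates Cooley-Tukey splits of a DFT size, measures each candidate's runtime, and keeps the fastest, tuning sub-sizes first in a dynamic-programming style.

```python
import cmath
import time


def dft_naive(x):
    """Direct O(n^2) DFT, used as the base case and as a correctness reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def dft_ct(x, plan):
    """DFT using the split recorded in `plan` (hypothetical format: size -> factor m).

    A missing or None entry means "use the naive DFT for this size".
    """
    n = len(x)
    m = plan.get(n)
    if m is None:
        return dft_naive(x)
    k = n // m
    # Decimation in time: m sub-DFTs of size k on strided inputs.
    subs = [dft_ct(x[r::m], plan) for r in range(m)]
    # Combine with twiddle factors; sub-DFT outputs are periodic with period k.
    return [sum(cmath.exp(-2j * cmath.pi * r * q / n) * subs[r][q % k]
                for r in range(m))
            for q in range(n)]


def time_plan(n, plan, repeats=3):
    """Measured runtime (best of `repeats`) of dft_ct on a size-n input."""
    x = [complex(i, 0) for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dft_ct(x, plan)
        best = min(best, time.perf_counter() - t0)
    return best


def search(n, plan=None):
    """Pick the fastest top-level split for size n, tuning sub-sizes first."""
    plan = {} if plan is None else plan
    if n in plan:
        return plan
    best_choice, best_time = None, time_plan(n, {**plan, n: None})
    for m in (d for d in range(2, n) if n % d == 0):
        search(n // m, plan)  # tune the sub-problem before timing this split
        t = time_plan(n, {**plan, n: m})
        if t < best_time:
            best_choice, best_time = m, t
    plan[n] = best_choice
    return plan


plan = search(16)  # the chosen splits depend on the machine and on timing noise
```

The resulting plan differs from run to run and machine to machine, which is the point of the feedback loop: the choice is driven by measurement rather than by a fixed cost model.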
A variable can be multiplied by a given set of fixed-point constants using a multiplier block that consists exclusively of additions, subtractions, and shifts. The generation of a multiplier block from the set of constants is known as the multiple constant multiplication (MCM) problem. Finding the optimal solution, i.e., the one with the fewest additions and subtractions, is known to be NP-complete. We propose a new algorithm for the MCM problem, which produces solutions that require up to 20% fewer additions and subtractions than the best previously known algorithm. At the same time, our algorithm, in contrast to the closest competing algorithm, is not limited by the constant bitwidths. We present our algorithm using a unifying formal framework for the best graph-based MCM algorithms and provide a detailed runtime analysis and experimental evaluation. We show that our algorithm can handle problem sizes as large as 100 32-bit constants in a time acceptable for most applications. The implementation of the new algorithm is available at www.spiral.net.
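To make the notion of a multiplier block concrete, here is a small hand-written sketch for the constant set {45, 75}. The constants, the function name `multiplier_block`, and the particular decomposition are illustrative assumptions; this is not output of the algorithm described above, and it is not claimed to be optimal. It only shows how a shared intermediate term (here 5x) lets two constant multiplications share work.

```python
def multiplier_block(x):
    """Multiply integer x by 45 and 75 using only shifts, adds, and subtracts.

    Hand-derived example decomposition (an assumption for illustration):
      5x  = (x  << 2) + x     # 4x + x
      45x = (5x << 3) + 5x    # 40x + 5x
      75x = (5x << 4) - 5x    # 80x - 5x
    The shared term 5x is computed once, so the block uses three
    additions/subtractions in total for both constants.
    """
    t5 = (x << 2) + x
    t45 = (t5 << 3) + t5
    t75 = (t5 << 4) - t5
    return {45: t45, 75: t75}
```

Computing each product separately from x would not reuse 5x; sharing intermediate terms across constants is what the MCM problem tries to exploit.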
A critical optimization in the domain of linear signal transforms, such as the discrete Fourier transform (DFT), is loop merging, which increases data locality and reuse and thus performance. In particular, this includes the conversion of shuffle operations into array reindexings. To date, loop merging is well understood only for the DFT, and only for Cooley-Tukey FFT based algorithms, which excludes DFT sizes divisible by large primes. In this paper, we present a formal loop merging framework for general signal transforms and its implementation within the SPIRAL code generator. The framework consists of Σ-SPL, a mathematical language to express loops and index mappings; a rewriting system to merge loops in Σ-SPL; and a compiler that translates Σ-SPL into code. We apply the framework to DFT sizes that cannot be handled using only the Cooley-Tukey FFT and compare our method to FFTW 3.0.1 and the vendor library Intel MKL 7.2.1. Compared to FFTW, our generated code is a factor of 2-4 faster under equal implementation conditions (same algorithms, same unrolling threshold). For some sizes, we show a speed-up of a factor of 9 using Bluestein's algorithm. Further, we give a detailed comparison against the Intel vendor library MKL; our generated code is between 2 times faster and 4.5 times slower.
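The conversion of a shuffle into an array reindexing can be sketched in plain Python. This is not Σ-SPL or SPIRAL output; the function names and the chosen computation (a stride permutation followed by size-2 butterflies) are assumptions picked to keep the example small. The unmerged version makes an explicit pass to permute the data into a temporary array; the merged version folds the permutation into the index expressions of the compute loop.

```python
def unmerged(x):
    """Two passes: an explicit stride permutation, then size-2 butterflies."""
    n = len(x)
    m = n // 2
    # Pass 1 (shuffle): gather elements at stride m into a temporary array.
    t = [0] * n
    for i in range(m):
        for j in range(2):
            t[2 * i + j] = x[j * m + i]
    # Pass 2 (compute): butterflies on contiguous pairs of the temporary.
    y = [0] * n
    for i in range(m):
        a, b = t[2 * i], t[2 * i + 1]
        y[2 * i], y[2 * i + 1] = a + b, a - b
    return y


def merged(x):
    """One pass: the permutation becomes a reindexed read of the input."""
    n = len(x)
    m = n // 2
    y = [0] * n
    for i in range(m):
        a, b = x[i], x[i + m]  # reads x directly at stride m, no temporary
        y[2 * i], y[2 * i + 1] = a + b, a - b
    return y
```

Both functions compute the same result for even-length inputs; the merged form avoids the temporary array and the extra sweep over memory, which is the locality benefit loop merging aims for.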
The emerging biological understanding of metastatic cancer and proof-of-concept clinical trials suggest that debulking all gross disease holds great promise for improving patient outcomes. However, ablation of multiple targets with conventional external beam radiotherapy (EBRT) systems is burdensome, which limits investigation and utilization of complete metastatic ablation in the majority of patients with advanced disease. To overcome this logistical hurdle, technical innovation is necessary. Biology-guided radiotherapy (BgRT) is a new EBRT delivery modality combining PET-CT with a 6 MV linear accelerator. The key innovation is continuous response of the linear accelerator to outgoing tumor PET emissions with beamlets of radiotherapy at subsecond latency. This allows the deposited dose to track tumors in real time. Multiple new hardware and algorithmic advances further facilitate this low-latency feedback process. By transforming tumors into their own fiducials after intravenous injection of a radiotracer, BgRT has the potential to enable complete metastatic ablation in a manner efficient for a single patient and scalable to entire populations with metastatic disease. Future trends may further enhance the utility of BgRT in the clinic as this technology dovetails with other innovations in radiotherapy, including novel dose painting and fractionation schemes, radiomics, and new radiotracers.