This paper advocates programming high-performance code using partial evaluation. We present a clean-slate programming system with a simple, annotation-based, online partial evaluator that operates on a CPS-style intermediate representation. Our system exposes code generation for accelerators (vectorization/parallelization for CPUs and GPUs) via compiler-known higher-order functions that can be subjected to partial evaluation. This way, generic implementations can be instantiated with target-specific code at compile time. In our experimental evaluation we present three extensive case studies from image processing, ray tracing, and genome sequence alignment. We demonstrate that using partial evaluation, we obtain high-performance implementations for CPUs and GPUs from one language and one code base in a generic way. The performance of our codes is mostly within 10%, often closer to the performance of multi man-year, industry-grade, manuallyoptimized expert codes that are considered to be among the top contenders in their fields. CCS Concepts: • Software and its engineering → Compilers; Parallel programming languages; • Hardware → Emerging languages and compilers; • Computing methodologies → Image processing; Ray tracing; • Applied computing → Bioinformatics;
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.