Porting applications to new hardware or programming models is a tedious and error-prone process. Any aid that eases these burdens saves developer time, which can then be invested in advancing the application itself rather than in preserving the status quo on a new platform. The Alpaka library defines and implements an abstract hierarchical redundant parallelism model. The model exploits parallelism and memory hierarchies on a node at all levels available in current hardware. By doing so, it achieves platform and performance portability across various types of accelerators: levels that a specific accelerator does not support are ignored, and only the supported ones are used. All hardware types (multi- and many-core CPUs, GPUs and other accelerators) are supported and can be programmed in the same way. The Alpaka C++ template interface allows for straightforward extension of the library to support other accelerators and specialization of its internals for optimization. Running Alpaka applications on a new (and supported) platform requires changing only one line of source code instead of maintaining many #ifdefs.
* This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 654220.
Pheochromocytomas and extra-adrenal paragangliomas (PHEO/PGLs) are rare catecholamine-producing chromaffin cell tumors. For metastatic disease, no effective therapy is available. Overexpression of somatostatin type 2 receptors (SSTR2) in PHEO/PGLs promotes interest in applying therapies using somatostatin analogs linked to radionuclides and/or cytotoxic compounds, such as [177Lu]Lu-DOTA-(Tyr3)octreotate (DOTATATE) and AN-238. Systematic evaluation of such therapies for the treatment of PHEO/PGLs requires sophisticated animal models. In this study, the mouse pheochromocytoma (MPC)-mCherry allograft model showed high tumor densities of murine SSTR2 (mSSTR2) and high tumor uptake of [64Cu]Cu-DOTATATE. Using tumor sections, we assessed mSSTR2-specific binding of DOTATATE, AN-238, and somatostatin-14. Therapeutic studies showed substantial reduction of tumor growth and tumor-related renal monoamine excretion in tumor-bearing mice after treatment with [177Lu]Lu-DOTATATE compared to AN-238 and doxorubicin. Analyses did not show agonist-dependent receptor downregulation after single mSSTR2-targeting therapies. This study demonstrates that the MPC-mCherry model is a uniquely powerful tool for the preclinical evaluation of SSTR2-targeting theranostic applications in vivo. Our findings highlight the therapeutic potential of somatostatin analogs, especially of [177Lu]Lu-DOTATATE, for the treatment of metastatic PHEO/PGLs. Repeated treatment cycles, fractionated combinations of SSTR2-targeting radionuclide and cytotoxic therapies, and other adjuvant compounds addressing additional mechanisms may further enhance therapeutic outcome.
We present an analysis of optimizing the performance of a single C++11 source code using the Alpaka hardware abstraction library. For this we use the general matrix multiplication (GEMM) algorithm in order to show that compilers can optimize Alpaka code effectively when key parameters of the algorithm are tuned. We do not intend to rival existing, highly optimized DGEMM versions, but merely choose this example to prove that Alpaka allows for platform-specific tuning with a single source code. In addition, we analyze the optimization potential available with vendor-specific compilers when confronted with the heavily templated abstractions of Alpaka. We specifically test the code on bleeding-edge architectures such as Nvidia's Tesla P100, Intel's Knights Landing (KNL) and Haswell architectures, and IBM's Power8 system. On some of these we are able to reach almost 50% of the peak floating-point performance using the aforementioned means. When adding compiler-specific #pragmas we are able to reach 5 TFLOP/s on a P100 and over 1 TFLOP/s on a KNL system.
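The idea of tuning key parameters of a single GEMM source can be illustrated with a minimal sketch in plain C++11. It does not use the Alpaka API and is not the code evaluated in the paper: the names `tiledGemm`, `kTileSize` and `gemm` are hypothetical, and the tile size of 4 is an arbitrary placeholder rather than a measured optimum. The tile edge is a template parameter, so the compiler sees it as a constant, and retargeting the code to another platform means changing one constant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Alpaka): computes C += A * B for square,
// row-major n x n matrices. The tile edge is a compile-time parameter, so the
// loop nest itself stays identical for every platform.
template <std::size_t TileSize>
void tiledGemm(std::size_t n,
               const std::vector<double>& a,
               const std::vector<double>& b,
               std::vector<double>& c)
{
    static_assert(TileSize > 0, "tile size must be positive");
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);

    // Outer loops walk over tiles; std::min handles n not divisible by TileSize.
    for (std::size_t i0 = 0; i0 < n; i0 += TileSize)
    {
        const std::size_t iEnd = std::min(i0 + TileSize, n);
        for (std::size_t k0 = 0; k0 < n; k0 += TileSize)
        {
            const std::size_t kEnd = std::min(k0 + TileSize, n);
            for (std::size_t j0 = 0; j0 < n; j0 += TileSize)
            {
                const std::size_t jEnd = std::min(j0 + TileSize, n);
                // Inner loops multiply one tile of A with one tile of B.
                for (std::size_t i = i0; i < iEnd; ++i)
                    for (std::size_t k = k0; k < kEnd; ++k)
                    {
                        const double aik = a[i * n + k];
                        for (std::size_t j = j0; j < jEnd; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
        }
    }
}

// Assumed tuning knob: the only line that would differ between platforms.
constexpr std::size_t kTileSize = 4;

// Convenience wrapper that applies the platform-specific tile size.
inline void gemm(std::size_t n,
                 const std::vector<double>& a,
                 const std::vector<double>& b,
                 std::vector<double>& c)
{
    tiledGemm<kTileSize>(n, a, b, c);
}
```

Calling `gemm(n, a, b, c)` with `c` initialized to zero yields the plain matrix product. Because the result does not depend on the tile size, the parameter can be varied freely while searching for the fastest setting on a given machine.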
With the appearance of the heterogeneous platform OpenPower, many-core accelerator devices have been coupled with Power host processors for the first time. To utilize their full potential, it is worth investigating performance-portable algorithms that allow choosing the best-fitting hardware for each domain-specific compute task. Suited even to the high level of parallelism on modern GPGPUs, our approach relies heavily on abstract meta-programming techniques, which are essential for focusing on fine-grained tuning rather than code porting. With this in mind, the CUDA-based open-source plasma simulation code PIConGPU is currently being abstracted to support the heterogeneous OpenPower platform using our fast porting interface cupla, which wraps the abstract parallel C++11 kernel acceleration library Alpaka. We demonstrate how PIConGPU can benefit from the tunable kernel execution strategies of the Alpaka library, achieving portability and performance with single-source kernels on conventional CPUs, Power8 CPUs and NVIDIA GPUs.