“…High-performance compilation for GPUs, by now, is hardly a new topic, and many different approaches have been used, including ones using OpenMP-style directives [14,17], ones that are fully automatic [27], ones based on functional languages [23], and ones based on the polyhedral model [26]. Other ones define an automatic, array computation middleware [11] designed as a back-end for multiple languages, including Python. Automatic, GPU-targeted compilers for languages embedded in Python also abound [4,6,20], most of which transform a Python AST at run…”