We present Lyncs-API, a Python API for Lattice QCD applications that is currently under development. Lyncs aims to bring several widely used Lattice QCD libraries under a common framework. It links flexibly to libraries for CPUs and GPUs in a way that can accommodate additional computing architectures as they arise, achieving performance portability for the calculations while maintaining the same high-level workflow. Lyncs distributes calculations using Dask and mpi4py, with bindings to the libraries generated automatically by cppyy. While Lyncs is designed to allow linking to multiple libraries, we focus on a set of targeted packages that includes DDalphaAMG, tmLQCD, QUDA and c-lime. More libraries will be added in the future. We also develop general-purpose tools that facilitate the use of Python in Lattice QCD and in HPC in general. The project is open-source, community-oriented and available on GitHub.
Sparse matrices are an integral part of scientific simulations. As hardware evolves, new sparse matrix storage formats are proposed that aim to exploit optimizations specific to the new hardware. In the era of heterogeneous computing, users are often required to use multiple formats for their applications to remain optimal across the different available hardware, resulting in longer development times and higher maintenance overhead. A potential solution to this problem is a lightweight auto-tuner driven by Machine Learning (ML) that selects, from a pool of available formats, the optimal format for the characteristics of the sparsity pattern, the target hardware and the operation to execute. In this paper, we introduce Morpheus-Oracle, a library that provides a lightweight ML auto-tuner capable of accurately predicting the optimal format across multiple backends, targeting the major HPC architectures and aiming to eliminate any format selection input by the end user. From more than 2000 real-life matrices, we achieve an average classification accuracy and balanced accuracy of 92.63% and 80.22% respectively across the available systems. The adoption of the auto-tuner results in an average speedup of 1.1× on CPUs and 1.5× to 8× on NVIDIA and AMD GPUs, with maximum speedups reaching up to 7× and 1000× respectively.
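The gap between the reported accuracy and balanced accuracy is typical when some formats are optimal for far fewer matrices than others. The sketch below is a minimal illustration, not Morpheus-Oracle code: the function names, the format labels (`CSR`, `DIA`) and the label counts are all assumptions made for the example. It uses the standard definitions, where accuracy is the fraction of correct predictions and balanced accuracy is the mean of the per-class recalls.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)  # number of samples per true class
    hits = defaultdict(int)   # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hits[t] += int(t == p)
    recalls = [hits[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels: 8 matrices run fastest in CSR, 2 in DIA.
y_true = ["CSR"] * 8 + ["DIA"] * 2
# Hypothetical predictions: one of the two DIA matrices is misclassified.
y_pred = ["CSR"] * 8 + ["CSR", "DIA"]

print(f"accuracy          = {accuracy(y_true, y_pred):.2f}")
print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.2f}")
```

In this toy case a single error on the minority class leaves accuracy high while halving that class's recall, which is why balanced accuracy is the more conservative of the two figures.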