“…Hence, while previous application studies with AsyncSHMEM have focused on the use of asynchronous task parallelism for hybrid applications [13], in G500 we focus more on concurrency and programmability: we use a single runtime thread per PE, on which the AsyncSHMEM runtime cooperatively multiplexes computation and communication.…”
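
The idea of cooperatively multiplexing computation and communication on one thread can be illustrated with a minimal sketch. This is not AsyncSHMEM code and does not use its API: the names `run_cooperatively`, `compute`, and `communicate` are hypothetical, and Python generators stand in for runtime tasks. Each task runs until it voluntarily yields, at which point the scheduler switches to the next ready task on the same thread.

```python
from collections import deque


def run_cooperatively(tasks):
    """Round-robin scheduler that runs generator-based tasks on the calling thread.

    Hypothetical stand-in for a single runtime thread per PE; it is an
    illustration only, not the AsyncSHMEM scheduler.
    """
    ready = deque(tasks)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            # Run the task until its next voluntary yield point.
            trace.append(next(task))
            # The task gave up control without finishing, so requeue it.
            ready.append(task)
        except StopIteration:
            # The task has finished and is dropped from the queue.
            pass
    return trace


def compute(steps):
    """Hypothetical computation task that yields after each unit of work."""
    for i in range(steps):
        yield f"compute step {i}"


def communicate(messages):
    """Hypothetical communication task that yields after each progress poll."""
    for i in range(messages):
        yield f"comm poll {i}"


trace = run_cooperatively([compute(3), communicate(2)])
for entry in trace:
    print(entry)
```

In this sketch the computation and communication steps interleave without any additional threads, because control changes hands only at explicit yield points. A real runtime would decide when to yield based on communication progress rather than a fixed round-robin order.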