“…Over the last decades, various approaches to reducing or eliminating the synchronization bottleneck in Krylov subspace methods have been proposed [64,10,9,17,20,26,18,12]. Recent methods that aim to eliminate global synchronization points include improved Krylov subspace methods [73,72,74], hierarchical Krylov subspace methods [46], enlarged Krylov subspace methods [39], an iteration-fusing Conjugate Gradient method [75], s-step Krylov subspace methods [11,43,6,5,4,44], and pipelined Krylov subspace methods [31,32,57,24,71]. Pipelined Krylov subspace methods aim to avoid communication latency by reducing the number of global synchronization bottlenecks and by hiding global communication behind useful computational work.…”