2018
DOI: 10.1007/978-3-030-10549-5_10

Progress Thread Placement for Overlapping MPI Non-blocking Collectives Using Simultaneous Multi-threading

Abstract: Non-blocking collectives have been proposed so as to allow communications to be overlapped with computation in order to amortize the cost of MPI collective operations. To obtain a good overlap ratio, communications and computation have to run in parallel. To achieve this, different hardware and software techniques exist. Dedicating some cores to run progress threads is one of them. However, some CPUs provide Simultaneous Multi-Threading, which is the ability for a core to have multiple hardware threads running…
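As a rough illustration of the overlap pattern described in the abstract (not taken from the paper itself), the following C/MPI sketch posts a non-blocking MPI_Iallreduce, performs independent computation, and then waits for completion; compute_chunk and the buffer size are hypothetical placeholders:

#include <mpi.h>
#include <stdlib.h>

/* Hypothetical computation that is independent of the collective's result. */
static void compute_chunk(double *data, int n) {
    for (int i = 0; i < n; i++)
        data[i] = data[i] * 1.0001 + 0.5;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    const int N = 1 << 20;                      /* arbitrary example size */
    double *sendbuf = malloc(N * sizeof(double));
    double *recvbuf = malloc(N * sizeof(double));
    double *work    = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) { sendbuf[i] = i; work[i] = i; }

    MPI_Request req;
    MPI_Iallreduce(sendbuf, recvbuf, N, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);       /* start the collective */

    compute_chunk(work, N);                     /* overlap with computation */

    MPI_Wait(&req, MPI_STATUS_IGNORE);          /* complete before using recvbuf */

    free(sendbuf); free(recvbuf); free(work);
    MPI_Finalize();
    return 0;
}

Whether real overlap occurs depends on how the MPI library progresses the operation in the background, which is exactly the placement question the paper studies (dedicated cores versus SMT hardware threads for progress threads).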

Cited by 2 publications (2 citation statements)
References 14 publications
“…We found that using a dedicated core is essential to guarantee sufficient progression of MPI messages and achieving our objective of fine-granular reactivity. Similar findings have been reported in [4,6]. In contrast to predictive load balancing, there is no mutual a-priori agreement on the task migration pattern.…”
Section: Communication Infrastructure (supporting)
confidence: 75%
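A minimal sketch of the dedicated-core progression strategy referred to above, assuming Linux pthread affinity and a hypothetical reserved core id PROGRESS_CORE (this is an illustration, not the cited implementation):

#define _GNU_SOURCE
#include <mpi.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

#define PROGRESS_CORE 3   /* hypothetical core reserved for progression */

static MPI_Request g_req;

/* Progress thread pinned to a dedicated core: it polls the outstanding
   request so the MPI library can advance the collective while the
   application keeps computing. Requires MPI_THREAD_MULTIPLE. */
static void *progress_loop(void *arg) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(PROGRESS_CORE, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    int flag = 0;
    while (!flag)
        MPI_Test(&g_req, &flag, MPI_STATUS_IGNORE);
    return NULL;
}

int main(int argc, char **argv) {
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    const int N = 1 << 20;
    double *in  = calloc(N, sizeof(double));
    double *out = calloc(N, sizeof(double));

    MPI_Iallreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &g_req);

    pthread_t progress;
    pthread_create(&progress, NULL, progress_loop, NULL);

    /* Independent computation overlapped with the collective. */
    double acc = 0.0;
    for (int i = 0; i < N; i++)
        acc += in[i] * 1.0001;
    (void)acc;

    pthread_join(progress, NULL);   /* the collective is complete once the poller exits */

    free(in); free(out);
    MPI_Finalize();
    return 0;
}

Spinning on MPI_Test keeps one core fully occupied, which is precisely the cost that motivates placing such progress threads on SMT hardware threads rather than on whole dedicated cores.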
“…This option is typically implemented with threads, which handle the status of the non-blocking operations and perform the corresponding progression. The drawback of this strategy is the significant overhead produced by the progression threads [26,27,28,29]. Manual progression is generally independent of the hardware and MPI library implementation, but needs some user effort to add MPI Test or MPI Probe calls to progress the communications.…”
Section: Global Communications (mentioning)
confidence: 99%
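For contrast, a sketch of the manual progression alternative mentioned in the quote, where the application interleaves MPI_Test calls with its own computation instead of relying on a progress thread (compute_block and the blocking factor are hypothetical):

#include <mpi.h>

/* Hypothetical kernel operating on one block of the iteration space. */
static void compute_block(double *work, int start, int len) {
    for (int i = start; i < start + len; i++)
        work[i] = work[i] * 0.5 + 1.0;
}

/* Manual progression: regular MPI_Test calls give the library opportunities
   to advance the non-blocking collective without a dedicated thread or core. */
void overlap_with_manual_progression(double *sendbuf, double *recvbuf,
                                     double *work, int n, int block) {
    MPI_Request req;
    MPI_Iallreduce(sendbuf, recvbuf, n, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    int flag = 0;
    for (int start = 0; start < n; start += block) {
        int len = (start + block <= n) ? block : n - start;
        compute_block(work, start, len);
        if (!flag)
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);   /* progress the collective */
    }
    if (!flag)
        MPI_Wait(&req, MPI_STATUS_IGNORE);              /* ensure completion */
}

This avoids the progress-thread overhead the citing paper points out, at the cost of scattering MPI_Test calls through the user code.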