2013
DOI: 10.1016/j.jpdc.2012.09.016
KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework

Abstract: The multiplication of cores in today's architectures raises the importance of intra-node communication in modern clusters and their impact on the overall parallel application performance. Although several proposals focused on this issue in the past, there is still a need for a portable and hardware-independent solution that addresses the requirements of both point-to-point and collective MPI operations inside shared-memory computing nodes. This paper presents the KNEM module for the Linux kernel that provides …

Cited by 61 publications (15 citation statements)
References 29 publications
“…In shared memory, the operating system allows the direct movement of data between two processes in just one transfer, for instance, using KNEM [41] or LiMIC [42] kernel modules in MPI, and high performance networks reduce the number of transfers through the use of RDMA (Remote Direct Memory Access) mechanisms [43].…”
Section: Modeling a Transmission
confidence: 99%
“…However, double copies are required for point-to-point communication. Goglin et al. [23] support efficient intra-node MPI communication for large messages by using kernel-assisted direct copies between processes. However, for small messages (such as those used in PDES), they observe that the standard two-copy implementation performs better.…”
Section: MPI on Shared Memory Architectures
confidence: 99%
“…Multiple data transfer strategies have been proposed [1], including relying on the external network interface, on specific network drivers, on custom operating system features [2], or on user-level techniques such as shared buffers and pipelining. This was still an active research area recently through platform-independent direct-copy mechanisms such as LiMIC [3] and KNEM [4], and the inclusion of Cross Memory Attach [5] in the Linux kernel.…”
Section: B. Too Many Configuration Options
confidence: 99%
“…These benchmarks are run using our mbench framework. It offers easy ways to set up memory buffers in specific cache states and compute the corresponding memory access throughputs for different numbers of threads. Measuring the memory throughput for different buffer sizes also captures the performance and size of each level of cache, which allows us to explicitly ignore them in the model.…”
Section: Modeling Communication by Combining Microbenchmarks
confidence: 99%