2012 41st International Conference on Parallel Processing (2012)
DOI: 10.1109/icpp.2012.9

A Hierarchical Approach for Load Balancing on Parallel Multi-core Systems

Abstract: Multi-core compute nodes with non-uniform memory access (NUMA) are now a common architecture in the assembly of large-scale parallel machines. On these machines, in addition to the network communication costs, the memory access costs within a compute node are also asymmetric. Ignoring this can lead to an increase in the data movement costs. Therefore, to fully exploit the potential of these nodes and reduce data access costs, it becomes crucial to have a complete view of the machine topology (i.e. the…

Cited by 33 publications (18 citation statements)
References 15 publications
“…Rashti [29] et al show how a better match between the communication and physical topologies on MPI applications can bring considerable gains in communication performance. Charm++ on NUMA machines face many of the same problems of actor-based REs and it has been shown that [30] NUMA awareness might bring improvements on the overall system performance.…”
Section: Related Work (mentioning)
confidence: 98%
“…Rashti [23] et al show how a better match between the communication and physical topologies on MPI applications can bring considerable gains in communication performance. Charm++ on NUMA machines face many of the same problems of actor-based REs and it has been shown that [20] NUMA awareness might bring improvements on the overall system performance.…”
Section: Related Work (mentioning)
confidence: 98%
“…Rashti [14] et al show how a better match between the communication and physical topologies on MPI applications can bring considerable gains in communication performance. Charm++ on NUMA machines face many of the same problems of actor REs and it has been shown that [11] NUMA awareness might bring improvements on the overall system performance. Aubry et al [16] demonstrate a complete dataflow based RE for the MPPA-256 architecture.…”
Section: Related Work (mentioning)
confidence: 98%