2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
DOI: 10.1109/ipdps.2010.5470442

Structuring the execution of OpenMP applications for multicore architectures

Abstract: The now commonplace multi-core chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers as a tradeoff between the user friendliness of shared memory on the one hand, and memory access scalability and efficiency on the other. However, getting high performance out of such machines requires a dynamic mapping of application tasks and data onto the underlying architecture. Moreover, depending on the application behavior, this mapping should fav…

Cited by 48 publications (32 citation statements)
References 14 publications
“…These mechanisms generate mapping information based on a very small number of samples compared to SAMMU, as all memory accesses are handled by the MMU. Some techniques such as Forest-GOMP [4] require annotations in the source code and depend on specific parallelization libraries. Similarly, Ogasawara [20] proposes a data mapping method that is limited to object-oriented languages.…”
Section: Related Work
confidence: 99%
“…ForestGOMP [5,6] is an OpenMP run-time with a resource-aware scheduler and a NUMA-aware allocator. It introduces three concepts: grouping of OpenMP threads into bubbles, scheduling of threads and bubbles using a hierarchy of runqueues, and migrating data dynamically upon load balancing.…”
Section: Related Work
confidence: 99%
“…On the operating system side, optimizations are compelled to place tasks and data conservatively [13,24], unless provided with detailed affinity information by the application [5,6], high-level libraries [26] or domain specific languages [20]. Nevertheless, as task-parallel run-times operate in user-space, a separate kernel component would add additional complexity to the solution; this advocates for a user-space approach.…”
Section: Introduction
confidence: 99%
“…A library called ForestGOMP is introduced in [Broquedis et al 2010a]. This library integrates into the OpenMP runtime environment and gathers information about the different parallel sections of the applications.…”
Section: Joint Thread and Data Mapping
confidence: 99%