2012 41st International Conference on Parallel Processing Workshops 2012
DOI: 10.1109/icppw.2012.62

Profiling of OpenMP Tasks with Score-P

Abstract: With the task construct, the OpenMP 3.0 specification introduces an additional level of parallelism that challenges established schemes of performance profiling. First, a thread may execute a sequence of interleaved task fragments the profiling system must properly distinguish to enable correct performance analyses. Furthermore, the additional parallelization dimension requires new visualization methods for presenting analysis results. Finally, as a new programming paradigm, tasking implicitly introduces parad…

Cited by 14 publications (5 citation statements)
References: 15 publications
“…In some of the benchmarked applications, we observed that the replay time for p = 1 is slightly bigger than the execution time of the original code. This happens due to small perturbation effects of task instrumentation [25]; the impact of this effect, however, is minimal. Table 2 presents the models for T_∞(n) and π(n) (average parallelism) that were created using the results from the TDG analysis.…”
Section: Analysis Of the Results | Type: mentioning
confidence: 99%
“…The former shows stub nodes at execution locations along the main call tree, and the latter shows a subtree for every task construct. The Score-P task profiling mechanism and resulting profile data is explained in [13] in more detail.…”
Section: Task Granularity | Type: mentioning
confidence: 99%
“…On the other hand, we spend 607s creating tasks. Former task analysis examples showed that task switches and task completion can require roughly the same amount of execution time which will appear as exclusive execution time in the barrier [13]. In principle, task dependency structures may limit parallelism.…”
Section: Task Granularity | Type: mentioning
confidence: 99%
“…With respect to future trace analysis enhancements, we plan to extend the current OpenMP analysis of Scalasca with the analysis of OpenMP tasks. Score-P can already record task events [12,9]. However, we must extend Scalasca's profile construction algorithm and we want to add some task specific patterns to its analysis.…”
Section: Future Work | Type: mentioning
confidence: 99%