2003
DOI: 10.1007/3-540-35767-x_23
Coarse Grain Task Parallel Processing with Cache Optimization on Shared Memory Multiprocessor

Cited by 14 publications (4 citation statements)
References 13 publications
“…In the proposed cache optimization scheme, the task scheduler assigns macro-tasks inside a DLG to the same processor as consecutively as possible [18], in addition to the "critical path" priority used by both static and dynamic scheduling. Figure 5 shows a schedule when the proposed cache optimization is applied to the macro-task graph in Figure 4(b) for a single processor.…”
Section: Consecutive Execution Of Data Localizable Group
confidence: 99%
“…In the proposed cache optimization scheme, a task scheduler for the coarse grain tasks assigns macro-tasks inside a DLG to the same processor as consecutively as possible [14], in addition to the "critical path" priority. Fig. 3 shows a schedule when the proposed cache optimization is applied to the macro-task graph in Fig. 2.…”
Section: Loop Aligned Decomposition
confidence: 99%
“…This paper proposes a padding scheme that reduces conflict misses to improve the performance of coarse grain task parallel processing. In the cache optimization for coarse grain task parallel processing [14], the compiler first divides loops into smaller loops so that the data size accessed by each loop fits the cache size. Next, the compiler analyzes parallelism among the tasks containing the divided loops using Earliest Executable Condition analysis and schedules tasks that share the same data to the same processor, so that the tasks execute consecutively while accessing the shared data in the cache.…”
Section: Introduction
confidence: 99%
“…Coarse-grained granulation [12] occurs when the time spent executing a program's data-processing operations exceeds the total time spent initializing those operations and transferring the data they need. This type of granulation corresponds to a nested-loop structure in which the outermost loop of the nest is parallel.…”
Section: Introduction
confidence: 99%
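The nested-loop form of coarse granulation described above can be sketched briefly (the worker function and data are hypothetical): the outermost loop carries no dependences, so each of its iterations becomes one coarse task whose work amortizes the cost of launching it.

```python
from concurrent.futures import ThreadPoolExecutor

def process_row(row):
    # One coarse-grained task: the entire inner loop over a row.
    return sum(x * x for x in row)

data = [[1, 2], [3, 4], [5, 6]]

# Parallel outermost loop: one task per iteration of the outer loop.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(process_row, data))

print(results)  # → [5, 25, 61]
```

Keeping the whole inner loop inside each task is what makes the grain coarse: task-startup and data-transfer overhead is paid once per row, not once per element.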