2010
DOI: 10.1145/1735970.1736055

Flexible architectural support for fine-grain scheduling

Abstract: To make efficient use of CMPs with tens to hundreds of cores, it is often necessary to exploit fine-grain parallelism. However, managing tasks of a few thousand instructions is particularly challenging, as the runtime must ensure load balance without compromising locality and introducing small overheads. Software-only schedulers can implement various scheduling algorithms that match the characteristics of different applications and programming models, but suffer significant overheads as they synchronize and co…

Cited by 34 publications (31 citation statements)
References 52 publications
“…Task queue virtualization: Applications may create an unbounded number of tasks and schedule them for a future time. Swarm uses an overflow/underflow mechanism to give the illusion of unbounded hardware task queues [27,41,64]. When the per-tile task queue is nearly full, the task unit dispatches a special, non-speculative coalescer task to one of the cores.…”
Section: Handling Limited Queue Sizes (mentioning)
confidence: 99%
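
The overflow/underflow idea in the statement above can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not Swarm's hardware mechanism: the class name `VirtualizedTaskQueue` and the parameter `hw_capacity` are hypothetical, the bounded queue is plain FIFO, and tasks spill when the bounded queue is full rather than nearly full. There is no coalescer task; arriving tasks are simply diverted to an unbounded in-memory buffer.

```python
from collections import deque


class VirtualizedTaskQueue:
    """Bounded queue backed by an unbounded spill buffer (illustrative only)."""

    def __init__(self, hw_capacity):
        # hw_capacity is a hypothetical parameter: the size of the bounded queue.
        assert hw_capacity > 0
        self.hw_capacity = hw_capacity
        self.hw = deque()       # stands in for the fixed-size hardware task queue
        self.spilled = deque()  # stands in for tasks moved out to memory

    def enqueue(self, task):
        # Overflow: once anything has spilled, later tasks follow it to memory
        # so that FIFO order is preserved across the two buffers.
        if self.spilled or len(self.hw) >= self.hw_capacity:
            self.spilled.append(task)
        else:
            self.hw.append(task)

    def dequeue(self):
        # Underflow: when the bounded queue drains, refill it from memory.
        if not self.hw:
            while self.spilled and len(self.hw) < self.hw_capacity:
                self.hw.append(self.spilled.popleft())
        return self.hw.popleft() if self.hw else None

    def __len__(self):
        return len(self.hw) + len(self.spilled)
```

A caller sees a single queue that never rejects an `enqueue`, which is the "illusion of unbounded" queues the statement describes; the cost is the extra copying between the two buffers.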
“…Delegation schemes divide shared data among threads and send updates to the corresponding thread, using shared-memory queues [11] or active messages [55,61]. Delegation is common in architectures that combine shared memory and message passing [55,64] and in NUMA-aware data structures [11,12].…”
Section: Software Techniques (mentioning)
confidence: 99%
“…Delegation is common in architectures that combine shared memory and message passing [55,64] and in NUMA-aware data structures [11,12]. Delegation is the software counterpart to RMOs, and is subject to the same tradeoffs: it reduces data movement and synchronization, but incurs global traffic and serialization.…”
Section: Software Techniques (mentioning)
confidence: 99%
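
As a rough illustration of the delegation scheme described in the two statements above, the sketch below partitions keys among owner threads and sends each update to its owner through a shared-memory queue. It is a minimal sketch, not code from any of the cited systems: the function name `run_delegated_counters`, the `num_owners` parameter, and the hash-based partitioning are assumptions made for the example.

```python
import queue
import threading


def run_delegated_counters(updates, num_owners=2):
    """Apply (key, delta) updates by delegating each to the thread that owns the key."""
    inboxes = [queue.Queue() for _ in range(num_owners)]
    # Each shard is read and written by exactly one owner thread, so no locks
    # are needed on the data itself; only the queues are shared.
    shards = [dict() for _ in range(num_owners)]

    def owner_loop(i):
        while True:
            msg = inboxes[i].get()
            if msg is None:  # sentinel: no more updates for this owner
                return
            key, delta = msg
            shards[i][key] = shards[i].get(key, 0) + delta

    owners = [threading.Thread(target=owner_loop, args=(i,)) for i in range(num_owners)]
    for t in owners:
        t.start()

    # Senders never touch the data; they only route updates to the owner.
    for key, delta in updates:
        inboxes[hash(key) % num_owners].put((key, delta))
    for inbox in inboxes:
        inbox.put(None)
    for t in owners:
        t.join()

    merged = {}
    for shard in shards:
        merged.update(shard)  # shards hold disjoint keys, so this cannot clash
    return merged
```

The tradeoff named in the statement is visible here: the data stays with one thread and needs no locking, but every update crosses a queue and all updates to one key are serialized at its owner.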
“…The growing popularity of task-based models has already motivated research into explicit hardware support for tasks. Carbon [13] and ADM [22] use hardware task queues to support fast task dispatch and stealing, whereas the Hyperprocessor [11] manages global dependencies using a universal register file.…”
Section: Related Work (mentioning)
confidence: 99%