Proceedings of the 22nd Annual International Conference on Supercomputing 2008
DOI: 10.1145/1375527.1375541

The shared-thread multiprocessor

Abstract: This paper describes initial results for an architecture called the Shared-Thread Multiprocessor (STMP). The STMP combines features of a multithreaded processor and a chip multiprocessor; specifically, it enables distinct cores on a chip multiprocessor to share thread state. This shared thread state allows the system to schedule threads from a shared pool onto individual cores, allowing for rapid movement of threads between cores. This paper demonstrates and evaluates three benefits of this architecture: (1) By…

Cited by 24 publications (15 citation statements)
References 20 publications
“…Previous work [3,38] describes support mechanisms for migrating register state in order to decrease the latency of thread activation and deactivation; however, performance subsequent to migration still suffers due to cold-cache effects. Our work is complementary; we specifically address the post-migration cache misses which limit the gains of those techniques.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…The cores of our CMP feature hardware support for thread activation and deactivation, as found in prior studies of thread scheduling [3,38]. While those works used hardware support to implement scheduling and time-sharing policies, we use it simply for adding and removing threads from cores.…”
Section: Baseline Multicore Architecture (mentioning)
confidence: 99%
“…Pseudo-parallelism share the same technical issues in common, related to the need for synchronisation between running programs. Figure 1 is a somewhat detailed view of the running of the four programs (labelled P1, P2, P3, and P4) in multiprogramming mode [5]. In the top part of the figure, all four programs seem to be running in parallel.…”
Section: Baseline Architecture (mentioning)
confidence: 99%
“…If there is data in cache on the user processor that must be accessed by the OS core, it must be transferred to the OS core (automatically handled by the coherence mechanism). The aggressive scheme is based on the technique proposed by Brown and Tullsen [9] and is assumed to incur a 100 cycle migration latency. They advocate hardware support for book-keeping and thread scheduling (normally done in software by an OS or virtual machine).…”
Section: Background and Motivation (mentioning)
confidence: 99%
“…The work by Brown and Tullsen [9], for example, attempts to design a low-latency process migration mechanism that is an important technology for OS off-load. Similarly, in this paper we assume that OS off-load is a promising approach and we attempt to resolve another component of OS off-load that may be essential for its eventual success, viz, the decision-making process that determines which operations should be off-loaded.…”
Section: Introduction (mentioning)
confidence: 99%