Proceedings of the 13th International Conference on Supercomputing 1999
DOI: 10.1145/305138.305214

Clustered speculative multithreaded processors

Abstract: In this paper we present a processor microarchitecture that can simultaneously execute multiple threads and has a clustered design for scalability purposes. A main feature of the proposed microarchitecture is its capability to spawn speculative threads from a single-thread application at run-time. These speculative threads use otherwise idle resources of the machine. Spawning a speculative thread involves predicting its control flow as well as its dependences with other threads and the values that flow through …

Cited by 149 publications (134 citation statements)
References 38 publications
“…A speculative thread in this model is identified by a SP-CQIP pair [14], where SP stands for the Spawning Point, i.e. the instruction in the execution stream where the speculative thread's execution is triggered.…”
Section: A Speculative Threads (mentioning)
confidence: 99%
“…the instruction from which the speculative thread begins execution. The choice of these pairs strongly affects the performance achieved by the system [14].…”
Section: A Speculative Threads (mentioning)
confidence: 99%
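The SP-CQIP pairing described in the statements above can be pictured with a small sketch. It is only an illustration under assumptions: the names `SpawningPair` and `spawns_on_trace`, and the use of integer program-counter values, are hypothetical and are not taken from the cited papers. It shows a pair as two instruction addresses, with a spawn triggered whenever execution reaches the SP and the new thread starting at the CQIP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpawningPair:
    """Hypothetical record for one SP-CQIP pair (illustrative names)."""
    sp: int    # address of the spawning point: reaching it triggers a spawn
    cqip: int  # address from which the speculative thread begins execution


def spawns_on_trace(trace, pairs):
    """Scan a trace of program-counter values and list triggered spawns.

    Returns (trace_index, start_address) for every position where the
    executing thread reaches the SP of a known pair.
    """
    by_sp = {pair.sp: pair for pair in pairs}
    spawns = []
    for index, pc in enumerate(trace):
        pair = by_sp.get(pc)
        if pair is not None:
            spawns.append((index, pair.cqip))
    return spawns


# Example: one pair whose SP is at 0x14 and whose thread starts at 0x40.
example_pairs = [SpawningPair(sp=0x14, cqip=0x40)]
example_trace = [0x10, 0x14, 0x18, 0x40]
example_spawns = spawns_on_trace(example_trace, example_pairs)
```

Which pairs are placed in such a table is the selection problem that the second statement says strongly affects performance; the sketch does not attempt to model that choice.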
“…is difficult because of pointer aliasing, irregular array accesses, and complex control flow. Thread-level speculation (TLS) [3,6,9,11,16,22,24,26] facilitates the parallelization of such applications by allowing potentially dependent threads to execute in parallel while maintaining the original sequential semantics of the programs through runtime checking. Although researchers have proposed numerous techniques for providing the proper hardware [17,18,23,25] and compiler [27][28][29] support for improving the efficiency of TLS, how to provide adequate compiler support for decomposing sequential programs into parallel threads that can deliver the desired performance has not yet been explored with the proper depth.…”
Section: Introduction (mentioning)
confidence: 99%
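The runtime checking that the statement above credits with preserving sequential semantics can be sketched in software. This is a minimal model under stated assumptions, not the mechanism of any cited system: stores are buffered per thread, loads from shared memory are recorded in a read set, and violations are detected lazily when threads commit in their original order. The names `SpecThread` and `commit_in_order` are hypothetical.

```python
class SpecThread:
    """Hypothetical speculative thread with buffered stores and a read set."""

    def __init__(self, order):
        self.order = order      # position in the original sequential order
        self.read_set = set()   # addresses loaded from shared memory
        self.writes = {}        # speculative stores, invisible until commit
        self.squashed = False

    def load(self, memory, addr):
        # A thread sees its own buffered store first; otherwise it reads
        # shared memory and records the address for later checking.
        if addr in self.writes:
            return self.writes[addr]
        self.read_set.add(addr)
        return memory[addr]

    def store(self, addr, value):
        self.writes[addr] = value


def commit_in_order(memory, threads):
    """Commit threads in sequential order, stopping at the first violation.

    When a thread commits, any later thread that already read one of the
    addresses it wrote has consumed a stale value and is marked squashed.
    Returns the orders of the threads that committed; the first squashed
    thread and everything after it would need to be re-executed.
    """
    committed = []
    ordered = sorted(threads, key=lambda t: t.order)
    for i, thread in enumerate(ordered):
        if thread.squashed:
            break
        memory.update(thread.writes)
        committed.append(thread.order)
        for later in ordered[i + 1:]:
            if not later.read_set.isdisjoint(thread.writes):
                later.squashed = True
    return committed


# Example: thread 1 reads x before thread 0's store to x becomes visible.
memory = {"x": 1, "y": 0}
t0, t1 = SpecThread(0), SpecThread(1)
stale = t1.load(memory, "x")
t1.store("y", stale + 1)
t0.store("x", 5)
committed = commit_in_order(memory, [t0, t1])
```

In the example, thread 1 is squashed because it read `x` before thread 0's store was committed, so only thread 0's update reaches memory. Hardware proposals typically detect such violations earlier and may forward values between threads; the sketch leaves both out for brevity.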
“…rePlay [21] does perform DBO on short atomic traces (16 to 256 instructions long), but they are not suitable for parallelization purposes. Before the many-core era, some systems were proposed [22,23,24,25] to use hardware-only technologies to speculate multiple consecutive atomic traces and execute them simultaneously on different functional units. In order to achieve reasonable speculation accuracy, however, these systems construct very short traces, which necessitates ultra-low communication latency to support program state transfer.…”
Section: Trace Construction and Prediction (mentioning)
confidence: 99%