1989
DOI: 10.1145/68182.68209

Limits on multiple instruction issue

Abstract: This paper investigates the limitations on designing a processor which can sustain an execution rate of greater than one instruction per cycle on highly-optimized, non-scientific applications. We have used trace-driven simulations to determine that these applications contain enough instruction independence to sustain an instruction rate of about two instructions per cycle. In a straightforward implementation, cost considerations argue strongly against decoding more than two instructions in one cycle. Given thi…

Cited by 54 publications (9 citation statements)
References 9 publications
“…This is similar to other parallelism limit studies [31,20], except that it is targeted towards fine-grain multithreaded PIM systems, in which the threads are primarily identified by the compiler. Threadlets exist within a basic block, which is a sequence of instructions occurring between two branch instructions in the program trace.…”
Section: Threadlets (mentioning)
confidence: 65%
“…Authors have found that block level scheduling has not provided enough parallelism for 8-issue parallelism. Similarly, the results in [34] present that scheduling beyond basic blocks can support performance which is more than two instructions per cycle on average. Nevertheless, this performance is only possible if necessary memory bandwidth is provided.…”
Section: Benchmark (mentioning)
confidence: 78%
“…Exploring available ILP from a given program has been crucial for application specific VLIW processors in order to reduce compiler effort and prevent redundant hardware. State of the art ILP extraction algorithms are based on either instruction traces [33,6,4,5,[34][35][36][37] or dependency graphs [38,39,15,[40][41][42].…”
Section: Related Work (mentioning)
confidence: 99%
“…A recent study reported that in non-numerical programs the average degree of instruction-level parallelism bounded by basic blocks is around 2 [2]. Another independent study generally confirmed this result but also reported that a significant increase in the degree of instruction-level parallelism would result, from 2.3 to 4.1, if control dependence in the programs were removed [3]. This is an interesting result since the increase from 2.3 to 4.1 is almost 80%.…”
Section: Introduction (mentioning)
confidence: 73%