MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture
DOI: 10.1109/micro.1999.809466

Optimizations and oracle parallelism with dynamic translation

Cited by 12 publications
(11 citation statements)
References 17 publications
“…Prior work has studied the limits of instruction-level parallelism under several idealizations, including a large or infinite instruction window, perfect branch prediction and memory disambiguation, and simple program transformations to remove unnecessary data dependences [4,9,18,20,24,42,49,57,74]. Similar to our limit study, these analyses find that parallelism is often plentiful (>1000×), but very large instruction windows are needed to exploit it (>100K instructions [42,57]).…”
Section: Additional Related Work (supporting)
confidence: 61%
“…As each VLIW tree region is translated, a number of optimizations are performed to enhance the available instruction parallelism. These include expansion of register-indirect branches into a series of conditional branches to increase scheduling opportunities [7], copy propagation, combining, load/store telescoping, and unification [8]. Speculation is used aggressively within a translation group, although results are committed in-order to the architected processor state to maintain precise exception behavior.…”
Section: Binary Translation Approach (mentioning)
confidence: 99%
“…DAISY/390 uses an alternative approach: instead of adding a guarding test to each translation unit, we use incremental dataflow analysis between blocks to minimize the need for code which checks compilation assumptions about the contents of base registers. When a block is translated, dataflow information for the current code block is generated for code optimization techniques performed at the code block level [8]. This includes information such as the constant propagation.…”
Section: Resolving Branch Target Addresses (mentioning)
confidence: 99%
“…Finally, dynamic optimizers can perform profitable optimizations such as partial inlining of functions and conditional branch elimination that would be too expensive to perform statically. SDT systems that perform dynamic optimization include Dynamo, (5) DBT, (6) and Voss and Eigenmann's remote dynamic program Optimization system. (7) Some of the binary translators previously described also perform some dynamic optimization (e.g., DAISY, FX!32, and Transmeta's Code Morphing).…”
Section: Introduction (mentioning)
confidence: 99%