Unified assign and schedule: a new approach to scheduling for clustered register file microarchitectures

Özer, Emre; Banerjia, Sanjeev; Conte, Tom

doi:10.1109/micro.1998.742792

Cited by 102 publications

(148 citation statements)

References 13 publications

Supporting

Mentioning: 146

Contrasting


“…The algorithm initially performs partial scheduling of all frequency configurations for "STEP" instructions (Alg. 1, lines 11, 14-20). This determines the best configuration and stores it into "BFC".…”

Section: The Driver (mentioning)

confidence: 99%

“…This could speed up the UCIFF scheduler, to reach speeds close to those of the Oracle. It is a unified cluster assignment and scheduling algorithm which shares some similarities with UAS [14] but has several unique attributes: i. It operates on a heterogeneous architecture where clusters operate at different frequencies (as described in Section 3.1).…”

Section: The Driver (mentioning)

confidence: 99%

“…The first work that combines cluster assignment and instruction scheduling was UAS [14]. Unlike BUG [5], this is a list-scheduling-based, not a critical-path-based, solution.…”

Section: Related Work (mentioning)

confidence: 99%


UCIFF: Unified Cluster Assignment Instruction Scheduling and Fast Frequency Selection for Heterogeneous Clustered VLIW Cores

Porpodas

Cintra

2013

Languages and Compilers for Parallel Computing


Abstract. Clustered VLIW processors are scalable wide-issue statically scheduled processors. Their design is based on physically partitioning the otherwise shared hardware resources, a design which leads to both high performance and low energy consumption. In traditional clustered VLIW processors, all clusters operate at the same frequency. Heterogeneous clustered VLIW processors, however, support dynamic voltage and frequency scaling (DVFS) independently per cluster. Effectively controlling DVFS, to selectively decrease the frequency of clusters with a lot of slack in their schedule, can lead to significant energy savings. In this paper we propose UCIFF, a new scheduling algorithm for heterogeneous clustered VLIW processors with software DVFS control that performs cluster assignment, instruction scheduling and fast frequency selection simultaneously, all in a single compiler pass. The proposed algorithm solves the phase-ordering problem between frequency selection and scheduling that is present in existing algorithms. We compared the quality of the generated code, using both performance and energy-related metrics, against that of the current state-of-the-art and an optimal scheduler. The results show that UCIFF produces better code than the state-of-the-art, very close to the optimal across the mediabench2 benchmarks, while keeping the algorithmic complexity low.


“…Ellis [21] proposed a popular method, BUG (Bottom-Up Greedy), to partition operations on a trace with scheduling in a two-phase sequence. Ozer et al. [22] proposed an algorithm, unified-assign-and-schedule (UAS), to combine cluster assignment and instruction scheduling together into a single phase. Nystrom and Eichenberger [23] presented an algorithm for modulo scheduling to perform partitioning with heuristics in a pre-modulo scheduling pass to allow modulo scheduling to be effective.…”

Section: Related Work (mentioning)

confidence: 99%

PALF: compiler supports for irregular register files in clustered VLIW DSP processors

Lin

You

Lee

2007

Concurrency and Computation


SUMMARY: Wide varieties of register file architectures developed for embedded processors have in recent years turned to reducing power dissipation and die size, in contrast with the traditional unified register file structures. This article presents a novel register allocation scheme for a clustered VLIW DSP, which is designed with distinctively banked register files in which port access is highly restricted. While the organization of the register files is designed to decrease power consumption by using fewer port connections, the cluster-based design makes register access across clusters an additional issue, and the switched-access nature of the register file demands further investigation into optimizing register assignment to increase instruction-level parallelism. We propose a heuristic algorithm, named ping-pong aware local favorable (PALF) register allocation, to obtain advantageous register allocation that is expected to better utilize irregular register file architectures. The results of experiments performed using a compiler based on the Open Research Compiler (ORC) showed significant performance improvement over the original ORC's approach, which is considered to be an optimized approach for common register file architectures.

Key words: register allocation; ping-pong register file; DSP; VLIW


“…While several works exist in the literature on clustered VLIW processors with a unified L1 data cache ([18], [19], [22], [14], among others), few works exist that deal with the wire delay problem at the memory hierarchy. Among them, the Raw machine [24] has an architectural configuration different to the traditional VLIW clustered core presented in this paper.…”

Section: Related Work (mentioning)

confidence: 99%

Flexible compiler-managed L0 buffers for clustered VLIW processors

Gibert

Sánchez

González

22nd Digital Avionics Systems Conference. Proceedings (Cat. No.03CH37449)

