This paper presents a framework, based on global array data-flow analysis, to reduce communication costs in a program being compiled for a distributed memory machine. We introduce the available section descriptor, a novel representation of communication involving array sections. This representation allows us to apply techniques for partial redundancy elimination to obtain powerful communication optimizations. Within a single framework, we are able to capture optimizations such as (i) vectorizing communication, (ii) eliminating communication that is redundant on any control flow path, (iii) reducing the amount of data being communicated, (iv) reducing the number of processors to which data must be communicated, and (v) moving communication earlier to hide latency and to subsume previous communication. We show that the bidirectional problem of eliminating partial redundancies can be decomposed into simpler unidirectional problems even in the context of an array section representation, which makes the analysis procedure more efficient. We present results from a preliminary implementation of this framework; these results are extremely encouraging and demonstrate the effectiveness of this analysis in improving the performance of programs.

Distributed memory architectures are becoming increasingly popular as a viable and cost-effective method of building massively parallel computers. However, the absence of a global address space, and consequently the need for explicit message passing among processes, makes these machines very difficult to program. This has motivated the design of languages like High Performance Fortran [10], which allow the programmer to write sequential or shared-memory parallel programs annotated with directives specifying data decomposition. The compilers for these languages are responsible for partitioning the computation and generating the communication necessary to fetch values of non-local data referenced by a processor. A number of such prototype compilers have been developed [18, 33, 23, 26, 22, 25, 3, 15, 28].

Since the cost of interprocessor communication is usually orders of magnitude higher than the cost of accessing local data, it is extremely important for compilers to optimize communication. The most common optimizations include message vectorization [18, 33], using collective communication [14, 23], and overlapping communication with computation [18]. However, most compilers perform little global analysis of communication requirements across different loop nests. This precludes more general optimizations, such as eliminating redundant communication, or performing extra communication inside one loop nest when it subsumes the communication required in the next loop nest.

This paper presents a framework, based on global array data-flow analysis, to reduce communication in a program. We apply techniques for partial redundancy elimination, first discussed in the context of eliminating redundant computation by Morel and Renvoise [24], and later refined by other researchers [8, 20, 9]. The conventional approach to data-flow analysis regards...