Software pipelining is a loop scheduling technique that extracts parallelism out of loops by o verlapping the execution of several consecutive iterations. Due to the overlapping of iterations, schedules impose high register requirements during their execution. A s c hedule is valid if it requires at most the number of registers available in the target architecture. If not, its register requirements have to be reduced either by decreasing the iteration overlapping or by spilling registers to memory. In this paper we describe a set of heuristics to increase the quality of register{constrained modulo schedules. The heuristics decide between the two previous alternatives and de ne criteria for e ectively selecting spilling candidates. The heuristics proposed for reducing the register pressure can be applied to any software pipelining technique. The proposals are evaluated using a register{ conscious software pipeliner on a workbench c o m p o s e d o f a large set of loops from the Perfect Club benchmark and a set of processor con gurations. Proposals in this paper are compared against a previous proposal already described in the literature. For one of these processor con gurations and the set of loops that do not t in the available registers (32), a speed{up of 1.68 and a reduction of the memory tra c by a factor of 0.57 are achieved with an a ordable increase in compilation time. For all the loops, this represents a speed{ up of 1.38 and a reduction of the memory tra c by a factor of 0.7.
Software pipelining is an instruction scheduling technique that exploits the instruction level parallelism (ILP) available in loops by overlapping operations from various successive loop iterations. The main drawback of aggressive software pipelining techniques is their high register requirements. If the requirements exceed the number of registers available in the target architecture, some steps need to be applied to reduce the register pressure (incurring some performance degradation): reduce iteration overlapping or spilling some lifetimes to memory. In the first part, we propose a set of heuristics to improve the spilling process and to better decide between adding spill code or directly decreasing the execution rate of iterations. The experimental evaluation, over a large number of representative loops and for a processor configuration, reports an increase in performance by a factor of 1.29 and a reduction of memory traffic by a factor of 1.36. In the second part, we analyze the use of backtracking and propose a novel approach for simultaneous instruction scheduling and register spilling in modulo scheduling: MIPS (modulo scheduling with integrated register spilling). The experimental evaluation reports an increase in performance by a factor of 1.46 and a reduction of the memory traffic by a factor of 1.66 (or an additional 1.13 and 1.22 with regard to the proposal in the first part). These improvements are achieved at the expense of a reasonable increase in the compilation time.Peer ReviewedPostprint (published version
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.