Abstract. We describe a novel technique for code selection based on data-flow graphs, which arise naturally in the domain of digital signal processing. Code selection is the optimized mapping of abstract operations to partial machine instructions. The presented method performs an important task within the retargetable microcode generator CBC, which was designed to meet the requirements of programming custom digital signal processors (DSPs). The algorithm exploits a graph representation in which control flow is modeled by scopes.
Introduction
In the domain of medium-throughput digital signal processing, micro-programmable processor cores are frequently chosen for system realization. By adding dedicated hardware (accelerator paths), these cores are tailored to the needs of new applications. Optimized processor modules can be reused, which is a major benefit compared to high-level synthesis [28], where a completely new design is developed for each application. Because of the application-specific add-ons and the rather short lifetime of a specific design, there is a need for retargetable software development tools, especially code generators.
Overview
In the next section we briefly discuss several related approaches to code generation and point out how our system differs from them. Section 3 introduces the overall architecture and functionality of the CBC code generator. Section 4 explains the code selection task and the basic techniques used. Section 5 presents our algorithm. We conclude the paper with experimental results.
The major differences between our code selection approach and similar tasks in "classic" code generation (CG) are:
- Complexity of datapaths. CBC has to deal with highly specialized and optimized datapaths. The hardware units make it possible to execute frequently used operation sequences efficiently. Operation patterns for the functional units of these datapaths are much more complex than those for standard microprocessors.
- Type handling. DSP algorithms may employ a large variety of word lengths and numerical types, whereas the hardware operators are restricted to fixed word lengths. A correct mapping must always be found. Most CG work neglects this topic because language definitions (and hence the compilers) are restricted to "implementation-dependent" types.
- Evaluation order. Approaches like [6,7] dealing with code selection assume a fixed evaluation order, which is usually derived from the imperative source code. There is no explicit scheduling phase included in the back-ends. Commonly, register allocation is performed during code selection. Most of the time this is done by graph coloring [4] or "on-the-fly".
- Parallelism of functional blocks. Most DSP architectures contain several functional units that work in parallel. Therefore, the final code cannot be emitted during or immediately after the code selection phase because partial instructions must be "compacted" into complete instructions at a later stage of compilation exploiting the possible para...