The size and complexity of current custom VLSI designs have forced the use of high-level programming languages to describe hardware, and of compiler and synthesis technology to map abstract designs into silicon. Many applications that operate on large streaming data sets require custom VLSI because of high-performance or low-power constraints. Since the data processing is typically described by loop constructs in a high-level language, loops are the most critical portions of the hardware description, and special techniques have been developed to synthesize them optimally. In this thesis, we introduce a new method for mapping nested loops into hardware and pipelining them efficiently. The technique achieves fine-grain parallelism even on inner loops with strong intra- and inter-iteration data dependences and, by economically sharing resources, improves performance at the expense of a small amount of additional area. We implemented the transformation within the Nimble Compiler environment and evaluated its performance on several signal-processing benchmarks. The method achieves up to a 2x increase in area efficiency compared to the best-known optimization techniques.
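To make the targeted loop structure concrete, the hypothetical C fragment below (not taken from the thesis) sketches the kind of nested loop such a method must handle: the inner loop's trip count depends on data computed inside the current outer iteration, and a value is carried across outer iterations, so straightforward pipelining of the outer loop stalls on these dependences.

```c
#include <stdio.h>

#define N 8

int main(void)
{
    /* Hypothetical streaming kernel illustrating intra- and inter-iteration
     * data dependences on the inner loop. */
    int data[N] = {3, 1, 4, 1, 5, 9, 2, 6};
    int carry = 0;                       /* carried across outer iterations */

    for (int i = 0; i < N; i++) {
        int bound = data[i] + carry;     /* intra-iteration dependence */
        int acc = 0;
        for (int j = 0; j < bound; j++)  /* data-dependent inner trip count */
            acc += (j ^ carry) & 0xF;
        carry = acc % 7;                 /* inter-iteration dependence */
        printf("i=%d bound=%d acc=%d\n", i, bound, acc);
    }
    return 0;
}
```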
What do we mean by single-chip multiprocessors?

Harr: Let's assume that there is at least one instruction-set processor on the chip. What needs to be defined next is whether there are multiple instruction-set processors on a chip, or one instruction-set processor and many custom data-path processors with possibly variable control.

Gupta: With instruction-set processors, it's not just a question of what the underlying component does, but also of at what level the software is integrated. Are we integrating at the compiler level or at the lowest level? In that sense, "system on chip" is pretty broad. "Systems on chip with reprogrammable DSP processors" is also very broad; "system-on-chip multiprocessors" is a much narrower term to use.

Olukotun: The classic multiprocessor chip has several instruction-set processors. We have everything from symmetric multiprocessors that share a central memory to ones with a network between the processors, which communicate via message passing. Putting all these architectures on a single chip is what people classically mean when they talk about multiprocessor chips. The processors may not necessarily all be the same; they could be specialized.

Jerraya: The multiprocessor concept, in which "multiprocessor" means having several master processors on a chip, is central to our discussion. These processors can be programmable instruction-set processors or they can be application-specific hardware. But the key concept is that you have several masters, each with its own bus. The main issue is how to connect them through a processor network.

Harr: We can now put more than just the instruction-set processor on a chip. With floating-point operations, we still have 90% of the chip left over. With all this extra real estate available, we could look at multiple instruction-set processors, or heterogeneous instruction-set processors with custom data paths intermixed, all sharing buses, memory hierarchies, and so forth.
Customers can just do it a whole lot more efficiently when they're in this next-generation environment, and that's what we've always tried to do: get folks to be able to build their systems quicker.