Abstract-The high computational effort of modern image processing applications such as medical imaging or high-resolution video processing often demands massively parallel special-purpose architectures in the form of FPGAs or ASICs. However, implementing them efficiently is still a challenge, as the design complexity causes exploding development times and costs. This paper presents a new design flow that makes it possible to specify, analyze, and synthesize complex image processing algorithms. A novel buffer requirement analysis allows exploiting possible tradeoffs between required communication memory and computational logic for multi-rate applications. The derived schedule and buffer results are taken into account for resource-optimized synthesis of the required hardware accelerators. Application to a multi-resolution filter shows that buffer analysis is possible in less than one second and that scheduling alternatives influence the required communication memory by up to 24% and the computational resources by up to 16%.
I. INTRODUCTION
As design complexity is becoming a major barrier to technical progress because of expensive and error-prone development, new design methodologies that raise the level of abstraction are becoming increasingly popular. Simulink [1] or SystemC-based high-level synthesis tools [2], for instance, make it possible to compose complex systems from communicating blocks. However, these approaches do not allow for system-level analysis such as determining the required communication buffer sizes, as the blocks can contain arbitrarily complex operations. Alternative approaches like [3], [4] are restricted to a subset of sequential languages such as C. Extracting the contained parallelism is challenging, however, especially as analysis of individual statements can become computationally expensive [5].
In order to address these aspects, this paper presents a novel design flow for high-level synthesis of complex multi-rate image processing applications containing up- and downsamplers. It extends previous work with a lattice-based buffer analysis that considers different scheduling alternatives for multi-rate systems. As the obtained results are directly taken into account during hardware synthesis, we are able to exploit tradeoffs between required communication memory and computational logic. Furthermore, in contrast to many other approaches, analysis of the overall system does not rely on solving Integer Linear Programs (ILPs) in the case of acyclic problems. Instead, ILPs are required only for local analyses such as actor synthesis or dependency calculation, which ensures that our design flow scales well.