Embedded systems usually consist of processors that execute domain-specific applications. Much of their functionality is implemented in software, which runs on one or more processors, leaving only the high-performance functions implemented in hardware. Typical embedded systems (such as TVs, cellular phones, and MP3 players) run multimedia or telecom applications that exhibit dynamic behavior: their execution costs (such as the number of cycles and energy) depend on the input data. Moreover, these applications are often implemented as a main loop, called the loop of interest, that is executed over and over again, reading, processing, and writing out individual stream objects (see Figure 1). A stream object could be a bit belonging to a compressed bitstream representing a coded video clip, or it could be a macroblock, video frame, audio sample, or network packet. Usually, these applications must deliver a given throughput (number of objects per second), which imposes a time constraint on each loop iteration.
The read part of the loop of interest takes a stream object from the input stream and separates it into a header and the object's data. The processing part consists of several kernels. The write part sends the processed data to output devices, such as a screen or speakers, and saves the application's internal state for further use; for example, in a video decoder, the previously decoded frame might be necessary to decode the current frame. The dynamism in modern applications leads to the use of different kernels for each stream object, depending on the object type. The actions executed in a particular loop iteration form the application's internal operation mode.
In this article, we describe a method that provides a systematic way of detecting and exploiting, at design time and runtime, the different internal operation modes.
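The read, process, and write structure of the loop of interest can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `StreamObject` type, the header labels `"I"` and `"P"`, the two toy kernels, and the `KERNELS` table are hypothetical and are not taken from any real decoder. The sketch only shows how the object type selects the kernels that run in an iteration, and how that set of kernels can be recorded as the iteration's internal operation mode.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class StreamObject:
    header: str        # object type (hypothetical labels "I" and "P")
    data: List[int]    # payload of the stream object


def intra_kernel(data: List[int], state: dict) -> List[int]:
    # Toy kernel: processes the object without using any saved state.
    return [x * 2 for x in data]


def inter_kernel(data: List[int], state: dict) -> List[int]:
    # Toy kernel: combines the object with the previously written result.
    previous = state.get("previous", [0] * len(data))
    return [x + p for x, p in zip(data, previous)]


# Hypothetical mapping from object type to the kernels used for it.
KERNELS: Dict[str, List[Callable[[List[int], dict], List[int]]]] = {
    "I": [intra_kernel],
    "P": [inter_kernel],
}


def loop_of_interest(
    stream: Iterable[StreamObject],
) -> Tuple[List[List[int]], List[Tuple[str, ...]]]:
    state: dict = {}                       # internal state kept across iterations
    output: List[List[int]] = []
    modes: List[Tuple[str, ...]] = []
    for obj in stream:
        # Read: separate the stream object into header and data.
        header, data = obj.header, obj.data
        # Process: the header selects which kernels run in this iteration.
        kernels = KERNELS[header]
        for kernel in kernels:
            data = kernel(data, state)
        # Write: emit the result and save state for later iterations.
        output.append(data)
        state["previous"] = data
        # The kernels executed in this iteration identify its operation mode.
        modes.append(tuple(k.__name__ for k in kernels))
    return output, modes
```

Calling `loop_of_interest` on a stream that mixes both object types returns the processed objects together with one operation-mode record per iteration, which is the kind of trace a profiling step at design time could collect.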
The fact that applications have different internal operation modes has not been fully exploited in embedded-system design thus far. Our approach combines static analysis and profiling of the system at design time with information collected at runtime about the system's environment. Knowing a system's possible operation modes and their resource consumption at design time makes it possible to take specific and aggressive design decisions for each operation mode at different design steps. To avoid complexity problems, we cluster the operation modes that are closely related to one another from a resource consumption perspective into application scenarios, distinguishing the truly different operation modes via different scenarios. It is then possible to derive a faster or lower-energy implementation (for example, by using different source-code optimizations per scenario) or a better estimation of required resources (such as the number of computation cycles or bandwidth). This leads to a smaller, less expensive, and more energy-efficient system that can deliver the required performance.
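As a rough illustration of clustering operation modes into scenarios, the Python sketch below groups modes whose profiled cycle counts lie close together and assigns each group a worst-case cycle budget. This is a simple greedy grouping chosen for illustration, not the clustering procedure of the method described in this article. The function name `cluster_into_scenarios`, the `tolerance` parameter, and the mode names and cycle counts in the usage example are all hypothetical.

```python
from typing import Dict, List


def cluster_into_scenarios(mode_cycles: Dict[str, int], tolerance: int) -> List[dict]:
    """Group operation modes with similar profiled cycle counts.

    Assumption: each mode is summarized by one cycle count obtained from
    design-time profiling. A mode joins the current scenario if its cost
    exceeds the cheapest mode of that scenario by at most `tolerance`.
    """
    ordered = sorted(mode_cycles.items(), key=lambda item: item[1])
    groups: List[List[tuple]] = []
    for mode, cycles in ordered:
        if groups and cycles - groups[-1][0][1] <= tolerance:
            groups[-1].append((mode, cycles))   # close enough: same scenario
        else:
            groups.append([(mode, cycles)])     # too different: new scenario
    # Each scenario is budgeted for its most expensive member.
    return [
        {"modes": [m for m, _ in group], "budget": max(c for _, c in group)}
        for group in groups
    ]


# Hypothetical profile: cycles per loop iteration for four operation modes.
profile = {"skip": 100, "P": 900, "B": 1000, "I": 2500}
scenarios = cluster_into_scenarios(profile, tolerance=200)
```

With this made-up profile, the two modes with similar costs share one scenario and one budget, while the cheapest and the most expensive modes each get a scenario of their own. A larger `tolerance` yields fewer scenarios with looser budgets; a smaller one yields more scenarios with tighter budgets, at the price of more cases to manage.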
A design method for handling increasingly dynamic real-time embeddeds...