Emerging embedded systems require heterogeneous multiprocessor SoC architectures that can satisfy both high-performance and programmability. However, as the complexity of embedded systems increases, software programming on an increasing number of multiprocessors faces several critical problems, such as multithreaded code generation, heterogeneous architecture adaptation, short design time, and low cost implementation. In this paper, we present a software code generation flow based on Simulink to address these problems. We propose a functional modeling style to capture data-intensive and controldependent target applications, and a system architecture modeling style to seamlessly transform the functional model into the target architecture. Both models are described using Simulink. From a system architecture Simulink model, a code generator produces a multithreaded code, inserting thread and communication primitives to abstract the heterogeneity of the target architecture. In addition, the multithread code generator called LESCEA applies the extensions of dataflow based memory optimization techniques, considering both data and control dependency. Experimental results on a Motion-JPEG decoder and an H.264 decoder show that the proposed multithread code generator enables easy software programming on different multiprocessor architectures with substantially reduced data memory size (up to 68.0%) and code memory size (up to 15.9%).
1-59593-381-6Reduction of the on-chip memory size is a key issue in video codec system design. Because video codec applications involve complex algorithms that are both data-intensive and controldependent, memory optimization based on global and precise analysis of data and control dependency is required. We generate a memory-efficient C code from a restricted Simulink model, which can represent both data and control dependency explicitly, by applying two buffer memory optimization techniques: copy removal and buffer sharing. Copy removal is performed while parsing the Simulink model. Buffer sharing requires global scheduling and formal lifetime analysis. Experimental results on an H.264 video decoder show that the buffer memory size and execution time of the C code generated by the proposed method are 71% and 32% less than those of the C code produced by Simulink?s C code generator, respectively. When compared to the hand written C code, the memory size was reduced by 27% while its execution time was increased by only 3%
Current multimedia applications demand complex heterogeneous multiprocessor architectures with specific communication infrastructure in order to achieve the required performances. Programming these architectures usually results in writing separate low-level code for the different processors (DSP, microcontroller), implying late global validation of the overall application with the hardware platform. We propose a platform-based software design flow able to efficiently use the resources of the architecture and allowing easy experimentation of several mappings of the application onto the platform resources. We use a high-level environment to capture both application and architecture initial representations. An executable software stack is generated automatically for each processor from the initial model. The software generation and validation is performed gradually corresponding to different software abstraction levels. Specific software development platforms (abstract models of the architecture) are generated and used to allow debugging of the different software components with explicit hardware-software interaction. We applied this approach on a multimedia platform, involving a high performance DSP and a RISC processor, to explore communication architecture and generate an efficient executable code for a multimedia application. Based on automatic tools, the proposed flow increases productivity and preserves design quality. ; Pier Stanislao Paolucci, ATMEL Roma Advanced DSP,
Heterogeneous MPSoCs present unique opportunities for emerging embedded applications, which require both high-performance and programmability. Although, software programming for these MPSoC architectures requires tedious and error-prone tasks, thereby automatic code generation tools are required. A code generation method based on fine-grain specification can provide more design space and optimization opportunities, such as exploiting fine-level parallelism and more efficient partitions. However, when partitioned, fine-grain models may require a large number of inter-processor communications, decreasing the overall system performance. This paper presents a Simulink-based multithread code generation method, which applies Message Aggregation optimization technique to reduce the number of interprocessor communications. This technique reduces the communication overheads in terms of execution time by reduction on the number of messages exchanged and in terms of memory size by the reduction on the number of channels. The paper also presents experiment results for one multimedia application, showing performance improvements and memory reduction obtained with Message Aggregation technique.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.