Abstract-We describe a fast VLSI architecture for full-search motion estimation for the blocks with 7 different sizes in MPEG-4 AVC/H.264. The proposed variable block size motion estimation (VBSME) architecture consists of a 16x16 PE array, an adder tree and comparators to find all 41 motion vectors and their minimum SADs for the blocks of 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4. It employs a 2-D datapath and its control of the search area data is simple and regular. The proposed VBSME can achieve 100% PE utilization by employing a preload register and a search data buffer inside each PE and allow real-time processing of 4CIF(704x576) video with 15 fps at 100 Mhz for a search range of [-32~+31].
We present a partial bus-in vertcoding scheme for pow er optim ization of sy stem level bus. In the proposed scheme, we select a su b-group of bus lines in volv ed in b us encod ing to avoid unnecessary inversion of b us lines not in th e subgroup thereby redu cing th e total n um berofbus tran sitions. We propose a heuristic algorithm that selects the s u b-grou p of bus lines for b us encoding. Ex periments on benchmark examples in dicate that the partial bus-in vert coding reduces the tot al bus transitions b y 62.6 on the a verage, compared to that of the unencoded patterns.
ISBN du colloque : 978-1-59593-627-1System-level design methodologies have been introduced as a solution to handle the design complexity of embedded multiprocessor SoC (MPSoC) systems. In this paper we describe a system-level design flow starting from Simulink specification, focusing on concurrent hardware and software design and verification at four different abstraction levels: Simulink Combined Algorithm and Architecture Model (CAAM), Virtual Architecture, Transaction-accurate Model and Virtual Prototype. We used two multimedia applications, Motion-JPEG and H.264, to evaluate this design flow. Experimental results show that our design flow can generate various MPSoC architectures from Simulink CAAM correctly and efficiently, allowing processor and task design space exploration at different abstraction levels
Emerging embedded systems require heterogeneous multiprocessor SoC architectures that can satisfy both high-performance and programmability. However, as the complexity of embedded systems increases, software programming on an increasing number of multiprocessors faces several critical problems, such as multithreaded code generation, heterogeneous architecture adaptation, short design time, and low cost implementation. In this paper, we present a software code generation flow based on Simulink to address these problems. We propose a functional modeling style to capture data-intensive and controldependent target applications, and a system architecture modeling style to seamlessly transform the functional model into the target architecture. Both models are described using Simulink. From a system architecture Simulink model, a code generator produces a multithreaded code, inserting thread and communication primitives to abstract the heterogeneity of the target architecture. In addition, the multithread code generator called LESCEA applies the extensions of dataflow based memory optimization techniques, considering both data and control dependency. Experimental results on a Motion-JPEG decoder and an H.264 decoder show that the proposed multithread code generator enables easy software programming on different multiprocessor architectures with substantially reduced data memory size (up to 68.0%) and code memory size (up to 15.9%).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.