This paper presents a novel C-simulation-based hardware-software co-verification environment and co-design methodology for computation-intensive applications. To reduce verification time while still verifying the design adequately, the paper uses a hierarchical verification method. At the IP level, we propose a verification strategy based on a C reference model. An example of a low-cost MPEG-4 decoder SoC is used to illustrate the validity of our approach.
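The IP-level strategy described above, checking a block against a reference model, can be sketched as follows. This is only an illustration under stated assumptions: the paper's reference models are written in C, whereas this sketch uses Python for brevity, and the function names (`reference_reconstruct`, `dut_reconstruct`, `buggy_reconstruct`, `verify`) and the chosen operation (adding a residual to a prediction and clipping to 8 bits, a common step in video decoding) are hypothetical and not taken from the paper.

```python
import random


def reference_reconstruct(pred, residual):
    """Golden reference model (hypothetical): add residual, clip to 0..255."""
    return max(0, min(255, pred + residual))


def dut_reconstruct(pred, residual):
    """Stand-in for the output of the IP block under test (hypothetical)."""
    total = pred + residual
    if total < 0:
        return 0
    if total > 255:
        return 255
    return total


def buggy_reconstruct(pred, residual):
    """A deliberately wrong block: wraps around instead of saturating."""
    return (pred + residual) & 0xFF


def verify(dut, reference, n_vectors=1000, seed=0):
    """Drive both models with the same random stimuli and collect mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(n_vectors):
        pred = rng.randint(0, 255)
        residual = rng.randint(-255, 255)
        expected = reference(pred, residual)
        actual = dut(pred, residual)
        if actual != expected:
            mismatches.append((pred, residual, expected, actual))
    return mismatches


if __name__ == "__main__":
    print("correct block mismatches:", len(verify(dut_reconstruct, reference_reconstruct)))
    print("buggy block mismatches:", len(verify(buggy_reconstruct, reference_reconstruct)))
```

The design choice illustrated here is that the reference model and the block under test receive identical stimuli, so any difference in output points to the block rather than to the test vectors.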
In this paper, we propose a novel design for a fast embedded face detection system that can be applied in many real-time applications, such as teleconferencing, user interfaces, and security access control. Our framework includes three parts: a fast face detection method based on an optimized AdaBoost algorithm with high speed and a high detection rate, an SoC hardware framework to speed up detection operations, and a software distribution strategy to optimize the memory subsystem. In the embedded domain, face detection is a great challenge because hardware resources are limited and clock frequencies are low. Our system prototype is built on an SoC named 'Garfield' to test our design. Experimental results demonstrate the validity of our face detection system.
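For readers unfamiliar with AdaBoost-based detection, the following is a minimal sketch of how a boosted strong classifier evaluates one candidate window. It assumes the common Viola-Jones-style formulation (threshold stumps combined by a weighted vote); the abstract does not say how the authors optimized AdaBoost, so nothing here should be read as their method. The `Stump` class, the feature values, and the weights are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Stump:
    """One weak classifier: a threshold test on a single feature value."""
    feature: int      # index into the feature vector
    threshold: float  # decision threshold for that feature
    polarity: int     # +1 or -1, selects the direction of the inequality
    alpha: float      # vote weight assigned during boosting

    def vote(self, x: Sequence[float]) -> int:
        # Returns 1 (face-like) when polarity * f(x) < polarity * threshold.
        return 1 if self.polarity * x[self.feature] < self.polarity * self.threshold else 0


def strong_classify(stumps: Sequence[Stump], x: Sequence[float]) -> bool:
    """Weighted vote: accept when the score reaches half of the total weight."""
    score = sum(s.alpha * s.vote(x) for s in stumps)
    return score >= 0.5 * sum(s.alpha for s in stumps)


# Hypothetical weak classifiers and feature vectors, for illustration only.
stumps = [
    Stump(feature=0, threshold=0.5, polarity=1, alpha=1.2),
    Stump(feature=1, threshold=0.3, polarity=-1, alpha=0.8),
    Stump(feature=2, threshold=0.7, polarity=1, alpha=0.5),
]

if __name__ == "__main__":
    print(strong_classify(stumps, [0.2, 0.9, 0.9]))
    print(strong_classify(stumps, [0.8, 0.1, 0.9]))
```

Each weak classifier costs only a comparison and a multiply-accumulate, which is one reason this family of detectors is attractive on hardware with limited resources.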
SUMMARY: One of the largest challenges for coarse-grained reconfigurable arrays (CGRAs) is how to map applications efficiently. The key issues for mapping are (1) how to reduce the memory bandwidth, (2) how to exploit parallelism in algorithms, and (3) how to achieve load balancing and take full advantage of the hardware potential. In this paper, we propose a novel parallelism scheme, called 'Hybrid partitioning', for mapping an H.264 high definition (HD) decoder onto REMUS-II, a CGRA system-on-chip (SoC). Combining good features of data partitioning and task partitioning, our methodology consists of three levels from top to bottom: (1) a hybrid task pipeline at the slice and macroblock (MB) level; (2) MB row-level data parallelism; (3) a sub-MB level parallelism method. Further, on the sub-MB level, we propose several mapping strategies, such as hybrid variable block size motion compensation (Hybrid VBSMC) for MC, 2D-wave for intra 4 × 4, and a parallel processing order for deblocking. With our mapping strategies, we improved the algorithm's performance on REMUS-II. For example, with a luma 16 × 16 MB, the Hybrid VBSMC achieves 4 times greater performance than VBSMC and 2.2 times greater performance than the fixed 4 × 4 partition approach. Finally, we achieve 1080p@33fps H.264 high-profile (HiP)@level 4.1 decoding when the working frequency of REMUS-II is 200 MHz. Compared with typical hardware platforms, we achieve better performance, area, and flexibility. For example, our performance is approximately 175% better than that of the commercial CGRA processor XPP-III while using only 70% of its area.
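The 2D-wave idea mentioned for intra 4 × 4 can be illustrated with a small scheduling sketch. It assumes a conservative dependency pattern in which each 4 × 4 block waits for its left, top-left, top, and top-right neighbours inside the same 16 × 16 luma MB, and it ignores dependencies on neighbouring MBs. The wave index `t = x + 2*y` is the usual textbook choice for this pattern; the abstract does not give the authors' exact schedule, so the function names and the schedule itself are assumptions.

```python
from collections import defaultdict

GRID = 4  # a 16x16 luma macroblock holds a 4x4 grid of 4x4 blocks


def dependencies(x, y):
    """Neighbours assumed to be required before block (x, y) can be predicted."""
    candidates = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
    return [(cx, cy) for cx, cy in candidates if 0 <= cx < GRID and 0 <= cy < GRID]


def wavefront_schedule():
    """Group blocks into time steps using the wave index t = x + 2*y."""
    steps = defaultdict(list)
    for y in range(GRID):
        for x in range(GRID):
            steps[x + 2 * y].append((x, y))
    return [steps[t] for t in sorted(steps)]


schedule = wavefront_schedule()
# Map each block to the step in which it is processed.
step_of = {block: t for t, blocks in enumerate(schedule) for block in blocks}

if __name__ == "__main__":
    for t, blocks in enumerate(schedule):
        print(f"step {t}: {blocks}")
```

Under these assumptions the 16 blocks complete in 10 steps rather than 16 sequential ones, with at most two blocks processed in the same step.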