This paper describes a high performance scalable massively parallel single-instruction multiple-data (SIMD) processor and power/area efficient real-time image processing. The SIMD processor combines 4-bit processing elements (PEs) with SRAM on a small area and thus enables at the same time a high performance of 191 GOPS, a high power efficiency of 310 GOPS/W, and a high area efficiency of 31.6 GOPS/mm . The applied pipeline architecture is optimized to reduce the number of controller overhead cycles so that the SIMD parallel processing unit can be utilized during up to 99% of the operating time of typical application programs. The processor can be also optimized for low cost, low power, and high performance multimedia system-on-a-chip (SoC) solutions. A combination of custom and automated implementation techniques enables scalability in the number of PEs. The processor has two operating modes, a normal frequency (NF) mode for higher power efficiency and a double frequency (DF) mode for higher performance. The combination of high area efficiency, high power efficiency, high performance, and the flexibility of the SIMD processor described in this paper expands the application of real-time image processing technology to a variety of electronic devices.Index Terms-Image processor, power efficiency, area efficiency, SIMD, scalable architecture, fine grained processing element.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.