Abstract-A heterogeneous many-core object recognition processor is proposed to realize robust and efficient object recognition on real-time video of cluttered scenes. Unlike previous approaches that simply aimed for high GOPS/W, we aim to achieve high Effective GOPS/W, or EGOPS/W, which only counts operations carried out on meaningful regions of an input image. This is achieved by the Unified Visual Attention Model (UVAM) which confines complex Scale Invariant Feature Transform (SIFT) feature extraction to meaningful object regions while rejecting meaningless background regions. The Intelligent Inference Engine (IIE), a mixed-mode neuro-fuzzy inference system, performs the top-down familiarity attention of the UVAM which guides attention toward pre-learned objects. Weight perturbation-based learning of the IIE ensures high attention precision through online adaptation. The SIFT recognition is accelerated by an optimized array of 4 20-way SIMD Vector Processing Elements, 32 MIMD Scalar Processing Elements, and 1 Feature Matching Processor. When processing 30 fps 640 480 video, the 50 mm 2 object recognition processor implemented in a 0.13 m process achieves 246 EGOPS/W, which is 46% higher than the previous work. The average power consumption is only 345 mW.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.