In this paper we present a scalable dataflow hardware architecture optimized for the computation of general-purpose vision algorithms, neuFlow, and a dataflow compiler, luaFlow, that transforms high-level flow-graph representations of these algorithms into machine code for neuFlow. This system was designed with the goal of providing real-time detection, categorization and localization of objects in complex scenes, while consuming 10 Watts when implemented on a Xilinx Virtex 6 FPGA platform, or about ten times less than a laptop computer, and producing speedups of up to 100 times in real-world applications. We present an application of the system to street scene analysis, segmenting 20 categories on 500 × 375 frames at 12 frames per second on our custom hardware neuFlow.
Abstract: In this paper we present a scalable hardware architecture to implement large-scale convolutional neural networks and state-of-the-art multi-layered artificial vision systems. This system is fully digital and is a modular vision engine with the goal of performing real-time detection, recognition and segmentation of mega-pixel images. We present a performance comparison between software, FPGA and ASIC implementations that shows a speedup for the custom hardware implementations.
Other models like HMAX-type models (Serre et al., 2005; Mutch and Lowe, 2006) and convolutional networks use two or more layers of successive feature extractors. Different training algorithms have been used for learning the parameters of convolutional networks. In LeCun et al. (1998b) and Huang and LeCun (2006), pure supervised learning is used to update the parameters. However, recent works have focused on training with an auxiliary task.
Abstract: This paper proposes an algorithm for feedforward categorization of objects, and in particular human postures, in real-time video sequences from address-event temporal-difference image sensors. The system employs an innovative combination of event-based hardware and bio-inspired software architecture. An event-based temporal-difference image sensor is used to provide input video sequences, while a software module extracts size- and position-invariant line features inspired by models of the primate visual cortex. The detected line features are organized into vectorial segments. After feature extraction, a modified line-segment Hausdorff-distance classifier, combined with on-the-fly cluster-based size- and position-invariant categorization, is applied. The system achieves an average success rate of about 90% in the categorization of human postures, while using only a small number of training samples. Compared to state-of-the-art bio-inspired categorization methods, the proposed algorithm requires fewer hardware resources, reduces the computational complexity by at least 5 times, and is an ideal candidate for hardware implementation with event-based circuits.
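The abstract does not spell out the exact form of the modified line-segment Hausdorff-distance classifier, but the classic modified Hausdorff distance (Dubuisson and Jain, 1994), computed here over 2D points sampled from detected segments, gives the flavor of the matching step. The function name and point-set representation below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def modified_hausdorff(a, b):
    """Modified Hausdorff distance between two 2D point sets.

    a, b: arrays of shape (n, 2) and (m, 2), e.g. points sampled
    along the vectorial line segments extracted from a frame.
    (Illustrative sketch; the paper's classifier operates on
    line-segment features and may differ in detail.)
    """
    # Pairwise Euclidean distances, shape (n, m).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Directed distances: mean nearest-neighbour distance each way.
    d_ab = d.min(axis=1).mean()
    d_ba = d.min(axis=0).mean()
    # Symmetrize by taking the larger directed distance.
    return max(d_ab, d_ba)
```

Classification then amounts to assigning a test feature set to the class of the nearest stored prototype under this distance, which is why only a small number of training samples is needed.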