We present a convolutional neural network (CNN) implementation for pixel processor array (PPA) sensors. PPA hardware consists of a fine-grained array of general-purpose processing elements, each capable of light capture, data storage, program execution, and communication with neighboring elements. This allows images to be stored and manipulated directly at the point of light capture, rather than transferred to external processing hardware. Our CNN approach divides this array into 4x4 blocks of processing elements, essentially trading off image resolution for increased local memory capacity per 4x4 "pixel". We implement parallel operations for addition, subtraction, and bit-shifting of images in this 4x4 block format. Using these components, we formulate how to perform ternary weight convolutions upon these images, compactly store the results of such convolutions, perform max-pooling, and transfer the resulting sub-sampled data to an attached micro-controller. We train ternary weight filter CNNs for digit recognition and a simple tracking task, and demonstrate inference of these networks upon the SCAMP5 PPA system. This work represents a first step towards embedding neural network processing capability directly onto the focal plane of a sensor.
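To illustrate why ternary weights suit hardware that offers only image addition and subtraction, the following is a minimal sketch in plain Python. It is not the SCAMP5 implementation: it runs sequentially on nested lists rather than in parallel on processing elements, it does not model the 4x4 block storage format, and it computes a "valid" cross-correlation, which is an assumption about the filter convention. The names `ternary_conv2d` and `max_pool2d` are hypothetical.

```python
def ternary_conv2d(image, weights):
    """Valid-mode 2D cross-correlation with weights restricted to {-1, 0, +1}.

    Because every weight is -1, 0, or +1, no multiplication is needed:
    each shifted copy of the image is either added, subtracted, or skipped.
    """
    kh, kw = len(weights), len(weights[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for dy in range(kh):
        for dx in range(kw):
            w = weights[dy][dx]
            if w not in (-1, 0, 1):
                raise ValueError("weights must be ternary: -1, 0, or +1")
            if w == 0:
                continue  # zero weights contribute nothing
            for y in range(out_h):
                for x in range(out_w):
                    if w > 0:
                        out[y][x] += image[y + dy][x + dx]
                    else:
                        out[y][x] -= image[y + dy][x + dx]
    return out


def max_pool2d(image, size=2):
    """Non-overlapping max-pooling; trailing rows/columns that do not fill a window are dropped."""
    out_h = len(image) // size
    out_w = len(image[0]) // size
    return [
        [
            max(image[y * size + i][x * size + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]
```

On a PPA, the two inner loops over `y` and `x` would correspond to a single array-wide add or subtract of a shifted image, so the cost would scale with the number of non-zero weights rather than with the number of pixels.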
This paper presents a method of occluding depth edge detection targeted towards RGB-D video streams and explores the use of these and other edge features in RGB-D SLAM. The proposed depth edge-detection approach uses prior information obtained from the previous RGB-D video frame to determine, based on image similarity, which areas of the current depth image are likely to contain edges. By limiting the search for edges to these areas, a significant amount of computation time is saved compared to searching the entire image. Pixels belonging to both the depth and colour edges of an RGB-D image can be back-projected using the depth component to form 3D point clouds of edge points. Registration between such edge point clouds is achieved using ICP, and we present a real-time RGB-D SLAM system utilizing such back-projected edge features. Experimental results on publicly available datasets demonstrate the performance of both the proposed depth edge detection and the SLAM system.
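The back-projection step mentioned above can be sketched as follows, assuming a standard pinhole camera model. The intrinsics `fx`, `fy`, `cx`, `cy`, the function name `back_project_edges`, and the convention that a non-positive depth marks a missing measurement are all assumptions made for illustration; the paper's own camera model and data layout are not stated here.

```python
def back_project_edges(edge_pixels, depth, fx, fy, cx, cy):
    """Back-project edge pixels into 3D camera coordinates.

    edge_pixels: iterable of (u, v) pixel coordinates (column, row).
    depth:       2D list indexed as depth[v][u], in metric units.
    fx, fy:      focal lengths in pixels (assumed pinhole model).
    cx, cy:      principal point in pixels.
    """
    points = []
    for u, v in edge_pixels:
        z = depth[v][u]
        if z <= 0:
            continue  # assumed convention: non-positive depth means no measurement
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points
```

The resulting list of `(x, y, z)` tuples forms the kind of edge point cloud that a registration method such as ICP could then align between frames.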