This paper describes the design of a single learning network that integrates both object location ("where") and object type ("what") from images of learned objects in natural, complex backgrounds. An in-place learning algorithm develops the internal representation of the network (including the bottom-up and top-down synaptic weights of every neuron), such that every neuron is responsible for learning its own signal-processing characteristics within its connected network environment, through interactions with other neurons in the same layer. In contrast with the previous fully connected MILN [13], the cells in each layer of this network are locally connected. Local analysis is achieved through multi-scale receptive fields, whose sizes increase from earlier to later layers. The experimental results show how one type of information assists the network to suppress irrelevant background information (given "where") or irrelevant object information (given "what"), so as to produce the required missing information ("what" or "where", respectively) in the motor output.
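The local connectivity and within-layer competition described above can be sketched in a minimal form. The function name, patch tiling, and top-k winner rule below are illustrative assumptions, not the paper's exact equations: each spatial site holds a small column of neurons whose receptive field is one square window, and lateral inhibition is approximated by zeroing all but the k strongest responses per site.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_responses(image, patch_size, weights, k=1):
    """Locally connected layer (illustrative sketch): each spatial site
    holds a column of neurons whose receptive field is a single
    patch_size x patch_size window of the image.  Lateral inhibition
    is approximated by keeping only the top-k responses in each column
    (the rest are set to zero)."""
    H, W = image.shape
    out = []
    for i in range(0, H - patch_size + 1, patch_size):
        row = []
        for j in range(0, W - patch_size + 1, patch_size):
            patch = image[i:i + patch_size, j:j + patch_size].ravel()
            patch = patch / (np.linalg.norm(patch) + 1e-9)
            r = weights @ patch            # bottom-up pre-responses
            cut = np.sort(r)[-k]           # k-th largest response
            row.append(np.where(r >= cut, r, 0.0))
        out.append(row)
    return np.array(out)                   # shape: (rows, cols, n_neurons)

# Toy demo: an 8x8 image, 4x4 receptive fields, 3 neurons per site.
image = rng.random((8, 8))
weights = rng.random((3, 16))
resp = layer_responses(image, patch_size=4, weights=weights, k=1)
```

Stacking such layers with growing `patch_size` gives the increasing perception sizes from earlier to later layers that the paper describes.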
Abstract-In this report, we propose an object learning system that fuses sensory information from an automotive radar system and a video camera. The radar system provides coarse attention, focusing visual analysis on relatively small areas within the image plane. The attended visual areas are coded and learned by a 3-layer neural network that uses in-place learning: each neuron is responsible for learning its own processing characteristics within its connected network environment, through inhibitory and excitatory connections with other neurons. The modeled bottom-up, lateral, and top-down connections in the network enable sensory sparse coding, unsupervised learning, and supervised learning to occur concurrently. The presented work is applied to learn two types of encountered objects in multiple outdoor driving settings. Cross-validation results show an overall recognition accuracy above 95% for the radar-attended window images. Compared with the uncoded representation and with purely unsupervised learning (without top-down connections), the proposed network improves the overall recognition rate by 15.93% and 6.35%, respectively. The proposed system also compares favorably with other learning algorithms; the results indicate that our learning system is the only one suited to incremental, online object learning in a real-time driving environment.
Index Terms-Intelligent vehicle system, sensor fusion, object learning, biologically inspired neural network, sparse coding.
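The concurrent unsupervised and supervised learning enabled by bottom-up, lateral, and top-down connections can be illustrated with a small sketch. The function name, the blended pre-response, the top-k winner rule, and the age-dependent amortized-mean learning rate are assumptions chosen to match the abstract's description, not the paper's exact update equations; passing an all-zero top-down vector makes the same step purely unsupervised.

```python
import numpy as np

rng = np.random.default_rng(1)

def inplace_update(x, z, W_bu, W_td, ages, k=1, top_down_weight=0.5):
    """One in-place learning step for a layer of neurons (sketch).

    x : bottom-up input vector (sensory code)
    z : top-down vector (e.g. one-hot class label; zeros if unsupervised)

    Each neuron's pre-response blends its bottom-up and top-down
    matches; lateral competition lets only the top-k winners update,
    and each winner updates its own weights with an age-dependent
    (amortized-mean) learning rate, so learning is incremental."""
    x = x / (np.linalg.norm(x) + 1e-9)
    pre = (1 - top_down_weight) * (W_bu @ x) + top_down_weight * (W_td @ z)
    winners = np.argsort(pre)[-k:]         # lateral top-k competition
    for i in winners:
        ages[i] += 1
        lr = 1.0 / ages[i]                 # amortized-mean learning rate
        W_bu[i] = (1 - lr) * W_bu[i] + lr * pre[i] * x
        W_td[i] = (1 - lr) * W_td[i] + lr * z
    return winners

# Toy demo: 4 neurons, 6-dim bottom-up input, 2 classes.
W_bu = rng.random((4, 6))
W_td = rng.random((4, 2))
ages = np.zeros(4, dtype=int)
x = rng.random(6)
z = np.array([1.0, 0.0])                   # supervised: class 0
winners = inplace_update(x, z, W_bu, W_td, ages, k=2)
```

Because every update touches only the winning neurons' own weights and ages, the step cost is constant per sample, which is what makes this style of learning suitable for the incremental, online setting the abstract claims.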