Embodied artificial cognitive systems, such as autonomous robots or intelligent observers, connect cognitive processes to sensory and effector systems in real time. Prime candidates for such embodied intelligence are neurally inspired architectures. While components such as forward neural networks are well established, designing pervasively autonomous neural architectures remains a challenge. This includes the problem of tuning the parameters of such architectures so that they deliver specified functionality under variable environmental conditions and retain these functions as the architectures are expanded. The scaling and autonomy problems are solved, in part, by dynamic field theory (DFT), a theoretical framework for the neural grounding of sensorimotor and cognitive processes. In this paper, we address how to efficiently build DFT architectures that control embodied agents and how to tune their parameters so that the desired cognitive functions emerge while such agents are situated in real environments. In DFT architectures, dynamic neural fields or nodes are assigned dynamic regimes, that is, attractor states and their instabilities, from which cognitive function emerges. Tuning thus amounts to determining values of the dynamic parameters for which the components of a DFT architecture are in the specified dynamic regime under the appropriate environmental conditions. The process of tuning is facilitated by the software framework cedar, which provides a graphical interface to build and execute DFT architectures. It enables to change dynamic parameters online and visualize the activation states of any component while the agent is receiving sensory inputs in real time. Using a simple example, we take the reader through the workflow of conceiving of DFT architectures, implementing them on embodied agents, tuning their parameters, and assessing performance while the system is coupled to real sensory inputs.
Handling objects or interacting with a human user about objects on a shared tabletop requires that objects be identified after learning from a small number of views and that object pose be estimated. We present a neurally inspired architecture that learns object instances by storing features extracted from a single view of each object. Input features are color and edge histograms from a localized area that is updated during processing. The system finds the best-matching view for the object in a novel input image while concurrently estimating the object’s pose, aligning the learned view with current input. The system is based on neural dynamics, computationally operating in real time, and can handle dynamic scenes directly off live video input. In a scenario with 30 everyday objects, the system achieves recognition rates of 87.2% from a single training view for each object, while also estimating pose quite precisely. We further demonstrate that the system can track moving objects, and that it can segment the visual array, selecting and recognizing one object while suppressing input from another known object in the immediate vicinity. Evaluation on the COIL-100 dataset, in which objects are depicted from different viewing angles, revealed recognition rates of 91.1% on the first 30 objects, each learned from four training views.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.