To perform complex tasks in realistic human environments, robots need to learn new concepts in the wild, incrementally, and through their interactions with humans. This paper presents an end-to-end pipeline for learning object models incrementally during human-robot interaction. The proposed pipeline consists of three parts: (a) recognizing the interaction type, (b) detecting the object that the interaction targets, and (c) incrementally learning object models from data recorded by the robot's sensors. Our main contributions lie in the target object detection, which is guided by the recognized interaction, and in the incremental object learning. The novelty of our approach is its focus on natural, heterogeneous, and multimodal human-robot interactions as the basis for incrementally learning new object models. Throughout the paper we highlight the main challenges of this problem, such as a high degree of occlusion and clutter, domain change, low-resolution data, and interaction ambiguity. Our work shows the benefits of multi-view approaches and of combining visual and language features, and our experimental results outperform standard baselines.

Note to Practitioners: This work was motivated by the challenges of recognition tasks in dynamic and varying scenarios. Our approach learns to recognize new user interactions and objects. To do so, it uses multimodal data from the user-robot interaction: visual data to learn the objects, and speech to learn the label and to help recognize the interaction type. We use state-of-the-art deep learning models to segment the user and the objects in the scene. Our incremental learning algorithm is based on a classic incremental clustering approach (a minimal sketch follows at the end of this note). The proposed pipeline works with all sensors mounted on the robot, so the system remains mobile. Our experiments use data recorded from a Baxter robot, which enables the use of its manipulator arms in future work, but the pipeline would work with any robot on which the same sensors can be mounted: two RGB-D cameras and a microphone. The pipeline currently has high computational requirements to run the two deep-learning-based steps; we have tested it on a desktop computer with a GTX 1060 GPU and 32 GB of RAM.
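The text identifies the incremental learner only as "a classic incremental clustering approach." As a non-authoritative illustration of one such classic scheme, the sketch below uses a nearest-centroid rule with a distance threshold for spawning new clusters; the class name, threshold value, and label handling are our own assumptions, not details from the paper.

```python
import numpy as np

class IncrementalClusterer:
    """Minimal nearest-centroid incremental clustering sketch (hypothetical).

    Each cluster keeps a running mean of the feature vectors assigned to it.
    A new observation either updates the closest cluster or, if it lies too
    far from every existing centroid, starts a new cluster (new object model).
    """

    def __init__(self, distance_threshold=0.5):
        self.distance_threshold = distance_threshold  # assumed value, task-dependent
        self.centroids = []  # running mean feature per object model
        self.counts = []     # number of observations absorbed per cluster
        self.labels = []     # spoken label attached to each cluster, if any

    def update(self, feature, label=None):
        """Assign one feature vector online; return the index of its cluster."""
        feature = np.array(feature, dtype=float)  # copy so the centroid owns its data
        if self.centroids:
            dists = [np.linalg.norm(feature - c) for c in self.centroids]
            idx = int(np.argmin(dists))
            if dists[idx] <= self.distance_threshold:
                # Incrementally update the running mean of the nearest cluster.
                self.counts[idx] += 1
                self.centroids[idx] += (feature - self.centroids[idx]) / self.counts[idx]
                if label is not None:
                    self.labels[idx] = label  # speech supplies/overwrites the label
                return idx
        # Too far from every known object model: spawn a new cluster.
        self.centroids.append(feature)
        self.counts.append(1)
        self.labels.append(label)
        return len(self.centroids) - 1
```

A running mean keeps memory constant per object and requires no retraining pass, which is what makes this family of methods attractive for learning during live interaction.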
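To make the three-stage structure concrete, the skeleton below wires stages (a) through (c) together, reusing the `IncrementalClusterer` from the previous sketch. Every function here is a hypothetical placeholder for the paper's deep segmentation and speech components; only the control flow reflects the description above.

```python
import numpy as np

def recognize_interaction(rgbd_frames, audio):
    """Stage (a): classify the interaction type (e.g. pointing, showing)."""
    return "pointing"  # placeholder for the deep interaction classifier

def detect_target_object(rgbd_frames, interaction_type):
    """Stage (b): segment the scene and pick the object the recognized
    interaction targets; reduced here to a dummy visual feature."""
    return np.random.rand(128)  # placeholder object feature

def transcribe_label(audio):
    """Extract the spoken object label from the microphone stream."""
    return "mug"  # placeholder speech-to-text output

def process_interaction(rgbd_frames, audio, clusterer):
    """Run one recorded interaction through stages (a)-(c)."""
    interaction = recognize_interaction(rgbd_frames, audio)
    feature = detect_target_object(rgbd_frames, interaction)
    label = transcribe_label(audio)
    return clusterer.update(feature, label)  # stage (c): incremental learning
```

The point of the skeleton is the data flow: the recognized interaction conditions the target detection, and vision and speech jointly feed the incremental learner.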