Biologically Inspired Model for Visual Cognition Achieving Unsupervised Episodic and Semantic Feature Learning

Qiao, Hong; Li, Yinlin; Li, Fengfu; Xi, Xuekui; Wu, Wei

doi:10.1109/tcyb.2015.2476706

Cited by 37 publications

(16 citation statements)

References 39 publications


“…Recently, a few works have been proposed for incorporating various information into common representation learning, such as semi-supervised and sparse regularizations [5], local group based priori [22], and semantic hierarchy [23]. Inspired by the considerable improvement by DNN in many single-modal tasks such as image classification [15] and object recognition [24], researchers have made great efforts to apply DNN to cross-modal retrieval as [6], [9], [25], [26]. For example, Ngiam et al [9] propose bimodal deep autoencoder, which is an extension of restricted Boltzmann machine (RBM).…”

Section: A Cross-modal Retrieval (mentioning)

confidence: 99%

MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval

Huang

Peng

Yuan

2020

IEEE Trans. Cybern.


Cross-modal retrieval has drawn wide interest for retrieval across different modalities of data (such as text, image, video, audio and 3D model). However, existing methods based on deep neural network (DNN) often face the challenge of insufficient cross-modal training data, which limits the training effectiveness and easily leads to overfitting. Transfer learning is usually adopted for relieving the problem of insufficient training data, but it mainly focuses on knowledge transfer only from large-scale datasets as single-modal source domain (such as ImageNet) to single-modal target domain. In fact, such large-scale single-modal datasets also contain rich modal-independent semantic knowledge that can be shared across different modalities. Besides, large-scale cross-modal datasets are very labor-consuming to collect and label, so it is significant to fully exploit the knowledge in single-modal datasets for boosting cross-modal retrieval. To achieve this goal, this paper proposes modal-adversarial hybrid transfer network (MHTN), which to the best of our knowledge is the first work to realize knowledge transfer from single-modal source domain to cross-modal target domain, and learn cross-modal common representation. It is an end-to-end architecture with two subnetworks: (1) Modal-sharing knowledge transfer subnetwork is proposed to jointly transfer knowledge from a large-scale single-modal dataset in source domain to all modalities in target domain with a star network structure, which distills modal-independent supplementary knowledge for promoting cross-modal common representation learning. (2) Modal-adversarial semantic learning subnetwork is proposed to construct an adversarial training mechanism between common representation generator and modality discriminator, making the common representation discriminative for semantics but indiscriminative for modalities to enhance cross-modal semantic consistency during transfer process. Comprehensive experiments on 4 widely-used datasets show its effectiveness and generality.


“…The low sampling efficiency of reinforcement learning makes training difficult, and a reasonable reward function and network structure need to be designed to achieve better results. Robot [163][164][165][166]; Computer vision [167][168][169][170][171]; Data analysis [23,172,173]; Self-supervised…”

Section: Reinforcement Learning (mentioning)

confidence: 99%

Object Detection Recognition and Robot Grasping Based on Machine Learning: A Survey

Bai

Yang

et al. 2020

IEEE Access


With the rapid development of machine learning, its powerful function in the machine vision field is increasingly reflected. The combination of machine vision and robotics to achieve the same precise and fast grasping as that of humans requires high-precision target detection and recognition, location and reasonable grasp strategy generation, which is the ultimate goal of global researchers and one of the prerequisites for the large-scale application of robots. Traditional machine learning has a long history and good achievements in the field of image processing and robot control. The CNN (convolutional neural network) algorithm realizes training of large-scale image datasets, solves the disadvantages of traditional machine learning in large datasets, and greatly improves accuracy, thereby positioning CNNs as a global research hotspot. However, the increasing difficulty of labeled data acquisition limits their development. Therefore, unsupervised learning, self-supervised learning and reinforcement learning, which are less dependent on labeled data, have also undergone rapid development and achieved good performance in the fields of image processing and robot capture. According to the inherent defects of vision, this paper summarizes the research achievements of tactile feedback in the fields of target recognition and robot grasping and finds that the combination of vision and tactile feedback can improve the success rate and robustness of robot grasping. This paper provides a systematic summary and analysis of the research status of machine vision and tactile feedback in the field of robot grasping and establishes a reasonable reference for future research.


“…Therefore, similarly, networks of this type are usually adopted for low-level sensory acquisition in robotic systems, such as vision (Perrinet et al., 2004), tactile sensing (Rochel et al., 2002), and olfaction (Cassidy and Ekanayake, 2006). For example, inspired by the structures and principles of primate visual cortex, Qiao et al. (2014, 2015, 2016) enhanced the feed-forward models including Hierarchical Max Pooling (HMAX) model and Convolutional Deep Belief Network (CDBN) with memory, association, active adjustment, semantic and episodic feature learning ability etc., and achieved good results in visual recognition task.…”

Section: Modeling Of Spiking Neural Network (mentioning)

confidence: 99%

A Survey of Robotics Control Based on Learning-Inspired Spiking Neural Networks

et al. 2018


Biological intelligence processes information using impulses or spikes, which makes those living creatures able to perceive and act in the real world exceptionally well and outperform state-of-the-art robots in almost every aspect of life. To make up the deficit, emerging hardware technologies and software knowledge in the fields of neuroscience, electronics, and computer science have made it possible to design biologically realistic robots controlled by spiking neural networks (SNNs), inspired by the mechanism of brains. However, a comprehensive review on controlling robots based on SNNs is still missing. In this paper, we survey the developments of the past decade in the field of spiking neural networks for control tasks, with particular focus on the fast emerging robotics-related applications. We first highlight the primary impetuses of SNN-based robotics tasks in terms of speed, energy efficiency, and computation capabilities. We then classify those SNN-based robotic applications according to different learning rules and explicate those learning rules with their corresponding robotic applications. We also briefly present some existing platforms that offer an interaction between SNNs and robotics simulations for exploration and exploitation. Finally, we conclude our survey with a forecast of future challenges and some associated potential research topics in terms of controlling robots based on SNNs.
