2019
DOI: 10.1007/s41095-019-0135-2
|View full text |Cite
|
Sign up to set email alerts
|

Recurrent 3D attentional networks for end-to-end active object recognition

Abstract: Active vision is inherently attention-driven: The agent actively selects views to attend in order to fast achieve the vision task while improving its internal representation of the scene being observed. Inspired by the recent success of attention-based models in 2D vision tasks based on single RGB images, we propose to address the multi-view depthbased active object recognition using attention mechanism, through developing an end-to-end recurrent 3D attentional network. The architecture takes advantage of a re… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1

Citation Types

0
2
0

Year Published

2019
2019
2024
2024

Publication Types

Select...
7

Relationship

1
6

Authors

Journals

citations
Cited by 10 publications
(2 citation statements)
references
References 22 publications
0
2
0
Order By: Relevance
“…Deep Reinforcement Learning (DRL), combining the powerful approximation ability of Deep Neural Network (DNN) and the excellent decision-making ability of Reinforcement Learning (RL), has made great progress in fields like robot control [9], adversarial games [10], and the foundation model training [11] with the in-depth research in recent years. In the context of the AcTR task, this learning paradigm also suits very well, which can be validated by the extensive studies and applications [12]- [13]. The features of the observed images can be effectively extracted through DNN to construct the state in the Markov decision process (MDP).…”
Section: Introductionmentioning
confidence: 90%
“…Deep Reinforcement Learning (DRL), combining the powerful approximation ability of Deep Neural Network (DNN) and the excellent decision-making ability of Reinforcement Learning (RL), has made great progress in fields like robot control [9], adversarial games [10], and the foundation model training [11] with the in-depth research in recent years. In the context of the AcTR task, this learning paradigm also suits very well, which can be validated by the extensive studies and applications [12]- [13]. The features of the observed images can be effectively extracted through DNN to construct the state in the Markov decision process (MDP).…”
Section: Introductionmentioning
confidence: 90%
“…Song et al [SZX15] propose an information‐theoretic approach based on 3D volumetric deep learning [WSK∗15]. When target objects are unknown , detection and recognition need to be solved simultaneously [LSZ∗19]. Ye et al [YLL∗18] propose navigation policy learning guided by active object detection and recognition.…”
Section: Related Workmentioning
confidence: 99%