2024
DOI: 10.1109/tnnls.2022.3175720

Simulation-Aided Handover Prediction From Video Using Recurrent Image-to-Motion Networks

Abstract: Recent advances in deep neural networks have opened up new possibilities for visuomotor robot learning. In the context of human-robot or robot-robot collaboration, such networks can be trained to predict future poses, and this information can be used to improve the dynamics of cooperative tasks. This is important both for realizing various cooperative behaviors and for ensuring safety. In this article, we propose a recurrent neural architecture, capable of transforming variable-length input motion vide…

citations
Cited by 5 publications
(6 citation statements)
references
References 49 publications
“…Recently, they extended the usage of GPR to create a database needed to train autoencoder NNs for dimensionality reduction (Lončarević et al, 2021). Mavsar et al (2022) in their work presented a recurrent neural architecture, capable of transforming variable-length input motion videos into a set of parameters describing a robot trajectory which is later encoded with DMP, where predictions can be made after receiving only a few frames, in addition, a simulation environment is utilized to expand the training database and to improve the generalization capability of the network, which is used for handover robotic tasks. Furthermore, Jaques et al (2021) in their study introduced the Newtonian Variational Autoencoder (Newtonian VAE), a framework for learning latent dynamics.…”
Section: DMPs Integration In Complex Framework (mentioning)
confidence: 99%
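
The citation statement above refers to trajectory parameters that are encoded with a DMP (dynamic movement primitive). As a rough illustration of how such a parameter vector maps to a motion, the following is a minimal sketch of a one-dimensional discrete DMP rollout in the standard textbook formulation. It is not the cited authors' implementation: the function name `dmp_rollout`, the gain values, the basis-function placement, and the Euler integration step are all assumptions chosen for illustration.

```python
import math


def dmp_rollout(y0, goal, weights, tau=1.0, dt=0.001, alpha_z=25.0, alpha_x=4.0):
    """Integrate a 1-D discrete DMP with explicit Euler steps.

    All defaults are illustrative assumptions, not values from the cited paper.
    `weights` are the forcing-term weights (e.g. parameters a network might predict).
    """
    beta_z = alpha_z / 4.0  # critically damped spring-damper system
    n = len(weights)

    # Basis-function centers spaced along the decaying phase variable x.
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    # Widths derived from the spacing between neighbouring centers.
    widths = []
    for i in range(n):
        if n == 1:
            widths.append(1.0)
        else:
            j = min(i, n - 2)
            widths.append(1.0 / (centers[j + 1] - centers[j]) ** 2)

    y, z, x = y0, 0.0, 1.0  # position, scaled velocity, phase
    trajectory = [y]
    for _ in range(int(round(tau / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        psi_sum = sum(psi)
        forcing = 0.0
        if psi_sum > 1e-12:
            weighted = sum(w * p for w, p in zip(weights, psi))
            forcing = weighted / psi_sum * x * (goal - y0)

        # Transformation system (dz, dy) and canonical system (dx).
        dz = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau

        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory


if __name__ == "__main__":
    # With all-zero weights the DMP reduces to a smooth point-to-point motion.
    plain = dmp_rollout(y0=0.0, goal=1.0, weights=[0.0] * 5)
    # Non-zero weights reshape the path on the way to the same goal.
    shaped = dmp_rollout(y0=0.0, goal=1.0, weights=[50.0] * 5)
    print(len(plain), plain[-1], shaped[len(shaped) // 4])
```

With zero weights the forcing term vanishes and the system simply converges to the goal; the weights only reshape the path taken. This is why a fixed-length weight vector plus a goal is a compact target for a network to predict.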
“…[24][25][26][27][28][29] In juxtaposition, human activity prediction algorithms are honed for forecasting forthcoming human actions and enable automated systems to formulate their course based on historical and current data. They encompass intention prediction, [30][31][32] motion prediction, [33][34][35][36][37][38][39] and attention estimation. [40,41] Beyond collaborative tasks, human activity prediction holds efficacy for surveillance systems and social HRI.…”
Section: Human Activity (mentioning)
confidence: 99%
“…
Body detection | MobileNet-SSD [18]; Openpose + SVM [19]; Bayesian Siamese neural network + CVAE [20]; YOLO + Bayesian DNN [21] | Human following and autonomous navigation; Visual tracking for autonomous robots tasked with humans and environment interaction; Safe HRI and HRC
Face recognition | SSD + FaceNet + KCF [22]; SFPD [23] | Human following and autonomous navigation; Simultaneous face and person detection for real-time HRI
Human activity
Activity recognition | Two-stream CNN [24]; 3D LRCN + 3D CNN + LSTM [25]; LSTM + VAE + DRL [26]; 3D-CNN [27]; STJ-CNN [28]; TCN [29] | Collaborative assembly and packaging; Safe HRI and HRC; Companion robots; HRI and VR applications
Intention prediction | CNN [30]; ILSTM + IBi-LSTM [31]; CNN + VMM [32] | Surveillance; Collaborative assembly
Motion prediction | RSSAC-Trajectron++ [33]; RNN [34,35]; VAE [36]; CVAE + LSTM [37]; Dynamic motion projection [38]; RNN + RIMEDNet [39] | Safe and efficient HRI and HRC; Collaborative manipulation and assembly; Human imitation; Social HRI; Handover tasks
Attention estimation | ANN [40]; LSTM [41] | Attention level estimation; Blind 3D human attention inference
Human pose
Body pose recognition | OpenPose + Angle-based rules [42]; Fast-SCNN + REDE [43]; PoseNet [44] | Ergonomics in HRC; Handover task; Efficient and safe HRI and HRC…”
Section: Human Position (mentioning)
confidence: 99%
“…In this paper, we propose a method to maximize the utilization of existing training data, consisting of input RGB videos and the corresponding labels. In our previous work, we developed an approach for generating object handover behaviors using recurrent neural networks [5], [6], where videos of the giver's motion are used as input to an LSTM network, which computes the necessary receiver's motion for a successful handover. The proposed network can predict either the handover location [6] or complete receiver trajectories [5].…”
Section: Introduction (mentioning)
confidence: 99%