Real-time three-dimensional (3D) pose estimation is of high interest in interactive applications, virtual reality, activity recognition, and, most importantly, the growing gaming industry. In this work, we present a method that captures and reconstructs the 3D skeletal pose and motion articulation of multiple characters using a monocular RGB camera. Our method addresses this challenging but useful task by taking advantage of recent developments in deep learning that allow two-dimensional (2D) pose estimation of multiple characters, together with the increasing availability of motion capture data. We match estimated 2D poses, extracted from a single camera via OpenPose, against a database of 2D multiview joint projections associated with their 3D motion representations. We then retrieve the 3D body pose of the tracked character, ensuring throughout that the reconstructed movements are natural, satisfy the model constraints, lie within a feasible set, and are temporally smooth and free of jitter. We demonstrate the performance of our method on several examples, including human locomotion, simultaneous capture of multiple characters, and motion reconstruction from different camera views.
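
A minimal sketch of the retrieval idea described above, assuming flattened joint arrays, a simple nearest-neighbor match against the 2D projection database, and exponential smoothing for temporal consistency; these choices, array shapes, and parameter values are illustrative assumptions rather than the exact formulation used in the paper.

```python
# Sketch (not the authors' implementation): match a normalized 2D pose against a
# database of multiview 2D projections of mocap frames, return the associated 3D
# pose, and smooth over time. Shapes and the smoothing factor are assumptions.
import numpy as np

def normalize_2d(pose_2d):
    """Center a (J, 2) array of joint positions and scale it to unit norm."""
    centered = pose_2d - pose_2d.mean(axis=0)
    return centered / (np.linalg.norm(centered) + 1e-8)

def retrieve_3d_pose(pose_2d, db_2d, db_3d):
    """Return the 3D pose whose 2D projection is nearest to the query.

    db_2d: (N, J, 2) multiview 2D joint projections of mocap frames.
    db_3d: (N, J, 3) corresponding 3D joint positions.
    """
    query = normalize_2d(pose_2d).ravel()
    flat_db = np.array([normalize_2d(p).ravel() for p in db_2d])
    idx = np.argmin(np.linalg.norm(flat_db - query, axis=1))
    return db_3d[idx]

def smooth(prev_3d, new_3d, alpha=0.7):
    """Exponential smoothing to keep the reconstructed motion free of jitter."""
    return new_3d if prev_3d is None else alpha * new_3d + (1 - alpha) * prev_3d

# Example usage with random stand-ins for OpenPose detections and a mocap database.
rng = np.random.default_rng(0)
db_2d = rng.normal(size=(1000, 15, 2))        # hypothetical 2D projection database
db_3d = rng.normal(size=(1000, 15, 3))        # associated 3D poses
tracked = None
for frame_2d in rng.normal(size=(5, 15, 2)):  # per-frame 2D joint detections
    tracked = smooth(tracked, retrieve_3d_pose(frame_2d, db_2d, db_3d))
```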