Model-based Hand Pose Estimation for Generalized Hand Shape with Appearance Normalization

Wöhlke, Jan; Li, Shile; Lee, Dongheui

doi:10.48550/arxiv.1807.00898

Cited by 2 publications

(3 citation statements)

References 26 publications

“…Much of the progress made in hand pose estimation have focused on using depth image inputs [5,6,7,8,11,14,15,18,19,32,33,35]. State-of-the-art methods use a convolutional neural network (CNN) architecture, with the majority of works treating the depth input as 2D pixels, though a few more recent approaches treat depth inputs as a set of 3D points and or voxels [7,5,15].…”

Section: Hand Pose Estimation (mentioning)

confidence: 99%

Disentangling Latent Hands for Image Synthesis and Pose Estimation

Yang, Yao

2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

125

Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation ranging from image background content to camera viewpoint. To better analyze these factors of variation, we propose the use of disentangled representations and a disentangled variational autoencoder (dVAE) that allows for specific sampling and inference of these factors. The derived objective from the variational lower bound as well as the proposed training strategy are highly flexible, allowing us to handle crossmodal encoders and decoders as well as semi-supervised learning scenarios. Experiments show that our dVAE can synthesize highly realistic images of the hand specifiable by both pose and image background content and also estimate 3D hand poses from RGB images with accuracy competitive with state-of-the-art on two public benchmarks.

“…Stereo input was similarly explored [24,40,47]. With the wide availability of commodity depth sensors around 2010, the research also focused on monocular depth or RGBD input [10,20,29,31,36,37,43,55,56,57,61]. However, shortly thereafter, the success of deep learning lead to the proliferation of robust systems that perform on monocular RGB input [3,4,9,11,13,17,22,34,41,48,53,54,58,62,63].…”

Section: Literature Overview (mentioning)

confidence: 99%

“…The accuracy achieved with the availability of depth information yielded the first practical real-time systems [37], and enabled on-the-fly adjustment of the bone lengths to the observed hand [29,59]. The development and availability of a parametric hand pose and shape model [49] in turn fueled research towards estimating the surface of the observed hand [2,11,22,32,61]. Finally, some very recent works are presenting the first attempts to model the appearance of the hand to some extent [5,32,44].…”

Section: Literature Overview (mentioning)

confidence: 99%

Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing

Karvounas, Kyriazis, Oikonomidis et al. 2021

Preprint

The amount and quality of datasets and tools available in the research field of hand pose and shape estimation act as evidence to the significant progress that has been made. We find that there is still room for improvement in both fronts, and even beyond. Even the datasets of the highest quality, reported to date, have shortcomings in annotation. There are tools in the literature that can assist in that direction and yet they have not been considered, so far. To demonstrate how these gaps can be bridged, we employ such a publicly available, multi-camera dataset of hands (InterHand2.6M), and perform effective image-based refinement to improve on the imperfect ground truth annotations, yielding a better dataset. The image-based refinement is achieved through differentiable ray tracing, a method that has not been employed so far to relevant problems and is hereby shown to be superior to the approximative alternatives that have been employed in the past. To tackle the lack of reliable ground truth, we resort to realistic synthetic data, to show that the improvement we induce is indeed significant, qualitatively, and quantitatively, too.
