2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.00893

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

Cited by 170 publications (113 citation statements)
References 34 publications
“…We compare the accuracy of Dexpilot-Single's online optimization with our neural network retargeter that relies on offline optimization during training. We gather a test set of 500 sequences from the DexYCB video dataset [13], which contains videos with annotated ground-truth human hand poses. For each video, at each timestep, the poses are fed to our neural network and Dexpilot-Single with a (generous) time budget of 40ms to solve.…”
Section: A. Accuracy of Retargeter Network (mentioning)
confidence: 99%
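The evaluation protocol quoted above (per-frame hand poses fed to a retargeter under a fixed 40 ms budget, over 500 DexYCB sequences) can be sketched as a simple timing loop. The sketch below is illustrative only; load_pose_sequences and retarget are hypothetical placeholders standing in for the citing paper's DexYCB loader and retargeting method, not real APIs.

```python
# Illustrative sketch of a per-frame retargeting benchmark on hand-pose
# sequences, in the spirit of the evaluation described above. The dataset
# loader and retargeter below are hypothetical placeholders, not real APIs.
import time
import numpy as np

TIME_BUDGET_S = 0.040  # 40 ms per-frame budget, as in the quoted setup


def load_pose_sequences(num_sequences=500, frames_per_seq=100):
    """Placeholder: yields synthetic (T, 21, 3) hand-joint trajectories.

    A real evaluation would load annotated hand poses from DexYCB instead.
    """
    rng = np.random.default_rng(0)
    for _ in range(num_sequences):
        yield rng.normal(size=(frames_per_seq, 21, 3))


def retarget(hand_pose, budget_s):
    """Placeholder retargeter: returns a fake robot-hand joint vector.

    A real implementation would run the network or optimizer within budget_s.
    """
    return hand_pose.reshape(-1)[:16]  # pretend 16 robot joint angles


overruns, total_frames, latencies = 0, 0, []
for seq in load_pose_sequences():
    for frame in seq:                       # one hand pose per timestep
        start = time.perf_counter()
        _ = retarget(frame, TIME_BUDGET_S)
        dt = time.perf_counter() - start
        latencies.append(dt)
        overruns += dt > TIME_BUDGET_S      # count budget violations
        total_frames += 1

print(f"mean latency: {1e3 * np.mean(latencies):.2f} ms, "
      f"budget overruns: {overruns}/{total_frames}")
```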
“…Not surprisingly, hand-object interaction has received much attention. This growth is accelerated by the introduction of datasets that contain both hand and object annotations [1,2,5,9,14,24,39]. Leveraging this data, a large number of methods attempt to estimate grasp parameters, such as the hand and object pose, directly from RGB images [4,10,15,16,22,26,40,42].…”
Section: Related Work (mentioning)
confidence: 99%
“…The grasping policy's purpose is to establish and maintain a stable grasp, whereas the motion synthesis module generates a motion to move the object to a user-specified target position. To guide the low-level grasping policy, we require a single grasp label corresponding to a static hand pose, which can be obtained either from a hand-grasping dataset [5,14] or from a state-of-the-art grasp synthesis method [19]. Crucially, we propose a reward function that is parameterized by the grasp label to incentivize the fingers to reach contact points on the object, leading to human-like grasps.…”
Section: Introduction (mentioning)
confidence: 99%
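A reward parameterized by a grasp label, as described above, is commonly written as a function of the distances between the fingertips and the labeled contact points on the object. The sketch below shows one generic form of such a contact-reaching reward; the function name, the exponential shaping, and the alpha parameter are assumptions for illustration, not the cited paper's exact formulation.

```python
# Illustrative contact-reaching reward parameterized by a grasp label.
# This is a generic sketch of the idea, not the cited paper's exact reward.
import numpy as np


def contact_reward(fingertips, contact_points, alpha=20.0):
    """Reward that grows as fingertips approach labeled contact points.

    fingertips:     (F, 3) current fingertip positions in the object frame.
    contact_points: (F, 3) per-finger contact points from the grasp label.
    alpha:          distance sensitivity (assumed value, for illustration).
    """
    dists = np.linalg.norm(fingertips - contact_points, axis=-1)  # (F,)
    return float(np.mean(np.exp(-alpha * dists)))  # in (0, 1]; 1 = in contact


# Example: five fingertips slightly off their labeled contact points.
rng = np.random.default_rng(0)
contacts = rng.uniform(-0.05, 0.05, size=(5, 3))       # grasp-label contacts
tips = contacts + rng.normal(scale=0.01, size=(5, 3))  # current fingertips
print(f"reward: {contact_reward(tips, contacts):.3f}")
```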
“…Table 7. Comparison among RGB-based 3D hand datasets (columns: Dataset, Type, Size, Mesh, UP, MV):
STB [84]: real, 36K, × × ×
RHD [92]: synthetic, 44K, × × ×
GANerated Hands [60]: synthetic, 331K, × × ×
SeqHAND [78]: synthetic, 410K, × ×
EgoDexter [61]: real, 3K, × × ×
Dexter+Object [70]: real, 3K, × × ×
FreiHAND [93]: real, 134K, × ×
YoutubeHand [46]: real, 47K, × ×
ObMan [29]: synthetic, 153K, × ×
HO3D [25]: real, 77K, × ×
DexYCB [8]: real, 528K, × ×
H2O [47]: synthetic, 571K, × ×
FPHA [16]: synthetic, 105K, × × ×
H3D [87]: real, 22K, × (15)
MHP [19]: real, 80K, × × (4)
MVHM [9]: synthetic, 320K, × (8)
InterHand2.6M [59]: real, 2.6M, × (80)
ours: synthetic, 328K, (216)…”
Section: Dataset Type (mentioning)
confidence: 99%
“…Motivation. As shown in Table 7, many datasets are developed for 3D hand pose estimation [84,92,61,70,93,46,19,87,8,47,16,78,60]. To collect real-world hand data, existing datasets are usually captured using a multiview studio and annotated via semi-automatic model fitting [93,19].…”
Section: Synthetic Dataset (mentioning)
confidence: 99%