2015 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2015.443

Understanding Everyday Hands in Action from RGB-D Images

Abstract: We analyze functional manipulations of handheld objects, formalizing the problem as one of fine-grained grasp classification. To do so, we make use of a recently developed fine-grained taxonomy of human-object grasps. We introduce a large dataset of 12000 RGB-D images covering 71 everyday grasps in natural interactions. Our dataset is different from past work (typically addressed from a robotics perspective) in terms of its scale, diversity, and combination of RGB and depth data. From a c…

Cited by 138 publications (114 citation statements). References 36 publications.
“…In this section, we present experiments and results that test the following hypotheses: 1) Using a combination of real and synthetic data to train our CNN for hand-object estimation yields more accurate results in terms of relative poses than reconstructing the hand-object pairs. We test this through a qualitative comparison of ours and a state-of-the-art method for hand-object reconstruction [15] on the challenging real-world dataset GUN-71 [24] (Sec. VI-A).…”
Section: Methods (mentioning)
confidence: 99%
“…Some authors used the depth information to perform a background/foreground segmentation followed by hand/object segmentation within the foreground region by using appearance information [67], [68], [69]. Wan et al [67] used a time-of-flight (ToF) camera to capture the scene during hand-object interactions.…”
Section: D Segmentation (mentioning)
confidence: 99%
“…Thus, after thresholding the histogram of depth values to isolate the foreground, hand pixels were detected by combining color (RGB thresholds) and texture (Gabor filters) features. The same ToF camera (Creative Senz3D™) was used by Rogez et al [68]. The authors trained a multi-class classifier on synthetic depth images of 1,500 different hand poses, in order to recognize one of these poses in the test depth images, thus producing a coarse segmentation mask.…”
Section: D Segmentation (mentioning)
confidence: 99%
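The depth-then-color pipeline described in these excerpts (threshold the depth histogram to isolate the foreground, then pick out hand pixels by appearance) can be sketched in a few lines. The snippet below is an illustrative approximation, not the cited authors' code: the HSV skin range, the ~15 cm depth margin, and the morphological cleanup are assumptions, and the Gabor-filter texture cue from [67] is omitted.

```python
# Sketch of depth-histogram foreground cut + color-based hand masking.
# All thresholds are illustrative assumptions, not values from [67]/[68].
import cv2
import numpy as np

def segment_hand(depth_mm: np.ndarray, bgr: np.ndarray) -> np.ndarray:
    """Return a binary hand mask from registered depth (mm) and BGR frames."""
    # 1) Foreground from the depth histogram: keep pixels around the near
    #    mode (hand/handheld object), cutting off the rest of the scene.
    valid = depth_mm[depth_mm > 0]
    hist, edges = np.histogram(valid, bins=64)
    near_mode = np.argmax(hist[:16])                # assume the hand is the near peak
    cutoff = edges[near_mode + 1] + 150.0           # ~15 cm margin (assumption)
    foreground = (depth_mm > 0) & (depth_mm < cutoff)

    # 2) Hand pixels inside the foreground via a coarse HSV skin threshold
    #    (stand-in for the RGB thresholds + Gabor texture features in [67]).
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    skin = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255)) > 0

    mask = (foreground & skin).astype(np.uint8) * 255
    # Remove small speckles with a morphological opening.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```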
“…Related to this purpose, inexpensive RGB-D cameras have greatly enhanced the efficiency of visual sensing. Many state-of-the-art works have explored recognition of human daily activities captured by egocentric vision [7,3,2], most of which focus on daily activities such as opening a coffee jar or grasping a mug when preparing coffee or making a cake in a kitchen scene [7]. These activities involve interacting with a variety of objects in common living scenes, with limited manipulation complexity.…”
Section: Introduction (mentioning)
confidence: 99%
“…One major approach to analyzing these egocentric experiences is through hand-object interactions (HOI). Rogez et al [3] build an RGB-D egocentric dataset describing fine-grained grasps, suggesting that hand pose, hand-object contact points, and contact force vectors greatly contribute to understanding HOI activities. Touch points are certainly strong clues to the objects being interacted with and the activities using them.…”
Section: Introduction (mentioning)
confidence: 99%