Hand parsing for fine-grained recognition of human grasps in monocular images

Saran, Akanksha; Teney, Damien

doi:10.1109/iros.2015.7354088

Cited by 11 publications

(5 citation statements)

References 38 publications

“…Yang et al [32] utilized a convolutional neural network to classify hand grasp types on unstructured public dataset and presented the usefulness of grasp types for predicting action intention. Saran et al [28] used detected hand parts as intermediate representation to recognize fine-grained grasp types. However, the recognition performance is still not good enough for practical usage in real-world environments.…”

Section: A Related Work (mentioning)

confidence: 99%

Understanding Hand-Object Manipulation with Grasp Types and Object Attributes

Cai

Kitani

Sato

Robotics: Science and Systems XII

Abstract: Our goal is to automate the understanding of natural hand-object manipulation by developing computer vision-based techniques. Our hypothesis is that it is necessary to model the grasp types of hands and the attributes of manipulated objects in order to accurately recognize manipulation actions. Specifically, we focus on recognizing hand grasp types, object attributes and actions from a single image within a unified model. First, we explore the contextual relationship between grasp types and object attributes, and show how that context can be used to boost the recognition of both grasp types and object attributes. Second, we propose to model actions with grasp types and object attributes based on the hypothesis that grasp types and object attributes contain complementary information for characterizing different actions. Our proposed action model outperforms traditional appearance-based models which are not designed to take into account semantic constraints such as grasp types or object attributes. Experiment results on public egocentric activities datasets strongly support our hypothesis.

“…One stream of works in this direction focuses on a hand alone, e.g. hand pose inferring and RGB or RGBD data configuration to either control the hand [13][14][15] or infer manipulation behavior [16][17][18]. Another line of study incorporates the concept that an object's structure determines the hand pose and investigates the hand along with the object [19][20][21].…”

Section: A Hand-object Interaction (mentioning)

confidence: 99%

Graph-Based Hand-Object Meshes and Poses Reconstruction With Multi-Modal Input

et al. 2021

Estimating the hand-object meshes and poses is a challenging computer vision problem with many practical applications. In this paper, we introduce a simple yet efficient hand-object reconstruction algorithm. To this end, we exploit the fact that both the poses and the meshes are graph-based representations of the hand-object with different levels of detail. This allows taking advantage of the powerful Graph Convolution networks (GCNs) to build a coarse-to-fine graph-based hand-object reconstruction algorithm. Thus, we start by estimating a coarse graph that represents the 2D hand-object poses. Then, more details (e.g. third dimension and mesh vertices) are gradually added to the graph until it represents the dense 3D hand-object meshes. This paper also explores the problem of representing the RGBD input in different modalities (e.g. voxelized RGBD). Hence, we adopted a multi-modal representation of the input by combining 3D representation (i.e. voxelized RGBD) and 2D representation (i.e. RGB only). We include intensive experimental evaluations that measure the ability of our simple algorithm to achieve state-of-the-art accuracy on the most challenging datasets (i.e. HO-3D and FPHAB). INDEX TERMS: Hand pose estimation, hand shape estimation, hand-object interaction, graph convolution, machine learning.

“…A typical problem setting involving first-person vision is to recognize activities of camera wearers. Recently, some work has focused on activity recognition [7,22,23,28], activity forecasting [6,9,26,31], person identification [11], gaze anticipation [45] and grasp recognition [3,4,21,35]. Similar to our setting, other work has also tried to recognize behaviors of other people observed in first-person videos, e.g., group discovery [2], eye contact detection [42] and activity recognition [33,34,44].…”

Section: Related Work (mentioning)

confidence: 99%

Future Person Localization in First-Person Videos

Yagi

Mangalam

Yonetani

et al. 2018

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

We present a new task that predicts future locations of people observed in first-person videos. Consider a first-person video stream continuously recorded by a wearable camera. Given a short clip of a person that is extracted from the complete stream, we aim to predict that person's location in future frames. To facilitate this future person localization ability, we make the following three key observations: a) First-person videos typically involve significant ego-motion which greatly affects the location of the target person in future frames; b) Scales of the target person act as a salient cue to estimate a perspective effect in first-person videos; c) First-person videos often capture people up-close, making it easier to leverage target poses (e.g., where they look) for predicting their future locations. We incorporate these three observations into a prediction framework with a multi-stream convolution-deconvolution architecture. Experimental results reveal our method to be effective on our new dataset as well as on a public social interaction dataset.
