Visual object-action recognition: Inferring object affordances from human demonstration

Kjellström, Hedvig; Romero, Javier; Kragić, Danica

doi:10.1016/j.cviu.2010.08.002

Cited by 207 publications

(152 citation statements)

References 36 publications

Supporting

Mentioning

151

Contrasting

Order By: Relevance

Section: Introductionmentioning

confidence: 99%

“…A similar idea was also exploited in [11], [12], [33]. In [12] the recognition module was based on a discriminative model that could not be used to generate grasping actions in the imitation phase. [11] addressed grasp imitation on a robot platform but the approach was based on manual mapping between human and robot hands, thus difficult to generalize across different embodiments.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Predicting human intention in visual observations of hand/object interactions

Song

Kyriazis

Oikonomidis

et al. 2013

2013 IEEE International Conference on Robotics and Automation

View full text Add to dashboard Cite

Abstract-The main contribution of this paper is a probabilistic method for predicting human manipulation intention from image sequences of human-object interaction. Predicting intention amounts to inferring the imminent manipulation task when human hand is observed to have stably grasped the object. Inference is performed by means of a probabilistic graphical model that encodes object grasping tasks over the 3D state of the observed scene. The 3D state is extracted from RGB-D image sequences by a novel vision-based, markerless hand-object 3D tracking framework. To deal with the high-dimensional statespace and mixed data types (discrete and continuous) involved in grasping tasks, we introduce a generative vector quantization method using mixture models and self-organizing maps. This yields a compact model for encoding of grasping actions, able of handling uncertain and partial sensory data. Experimentation showed that the model trained on simulated data can provide a potent basis for accurate goal-inference with partial and noisy observations of actual real-world demonstrations. We also show a grasp selection process, guided by the inferred human intention, to illustrate the use of the system for goal-directed grasp imitation.

show abstract

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Predicting human intention in visual observations of hand/object interactions

Song

Kyriazis

Oikonomidis

et al. 2013

2013 IEEE International Conference on Robotics and Automation

View full text Add to dashboard Cite

show abstract

“…Their model discovered the connectivity and spatial relationships between the objects and body parts. Different to our work, Kjellstrom et al [6] assumed that objects and actions of interests are already categorized. Then the relations are inferred from video data and represented as pairs between action and object classes like "drink-cup".…”

Section: Introductionmentioning

confidence: 99%

Determining Interacting Objects in Human-Centric Activities via Qualitative Spatio-Temporal Reasoning

Sokeh

Gould

Renz

2015

Computer Vision -- ACCV 2014

View full text Add to dashboard Cite

Abstract. Understanding the activities taking place in a video is a challenging problem in Artificial Intelligence. Complex video sequences contain many activities and involve a multitude of interacting objects. Determining which objects are relevant to a particular activity is the first step in understanding the activity. Indeed many objects in the scene are irrelevant to the main activity taking place. In this work, we consider human-centric activities and look to identify which objects in the scene are involved in the activity. We take an activity-agnostic approach and rank every moving object in the scene with how likely it is to be involved in the activity. We use a comprehensive spatio-temporal representation that captures the joint movement between humans and each object. We then use supervised machine learning techniques to recognize relevant objects based on these features. Our approach is tested on the challenging Mind's Eye dataset.

show abstract

“…Some works predict affordance-based or function-based object attributes. For example, [19] consider newspapers and books as readable and books and hammers as hammerable. Such interpretation was also used in several other works [25,36,7].…”

Section: Related Workmentioning

confidence: 99%

Physically Grounded Spatio-temporal Object Affordances

Koppula

Saxena

2014

Lecture Notes in Computer Science

View full text Add to dashboard Cite

Abstract. Objects in human environments support various functionalities which govern how people interact with their environments in order to perform tasks. In this work, we discuss how to represent and learn a functional understanding of an environment in terms of object affordances. Such an understanding is useful for many applications such as activity detection and assistive robotics. Starting with a semantic notion of affordances, we present a generative model that takes a given environment and human intention into account, and grounds the affordances in the form of spatial locations on the object and temporal trajectories in the 3D environment. The probabilistic model also allows uncertainties and variations in the grounded affordances. We apply our approach on RGB-D videos from Cornell Activity Dataset, where we first show that we can successfully ground the affordances, and we then show that learning such affordances improves performance in the labeling tasks.

show abstract

Visual object-action recognition: Inferring object affordances from human demonstration

Cited by 207 publications

References 36 publications

Predicting human intention in visual observations of hand/object interactions

Predicting human intention in visual observations of hand/object interactions

Determining Interacting Objects in Human-Centric Activities via Qualitative Spatio-Temporal Reasoning

Physically Grounded Spatio-temporal Object Affordances

Contact Info

Product

Resources

About