Predicting human intention in visual observations of hand/object interactions

Song, Dan; Kyriazis, Nicholas; Oikonomidis, Iason; Papazov, Chavdar; Argyros, Antonis A.; Burschka, Darius; Kragić, Danica

doi:10.1109/icra.2013.6630785

Cited by 47 publications

(30 citation statements)

References 30 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…An important application of imitation learning is robotic grasp selection. Song et al [6] instead proposed a robot grasping planning method which aims to learn human intention and mapping to the robotic embodiment on this more abstracted level, instead of directly mapping the grasps from the human to the robotic grasp spaces. However, in their method, a fixed setting of high-level object and action parameters need to be manually specified.…”

Section: Related Workmentioning

confidence: 99%

“…Moreover, as a topic model, LM-LDA represents the data in terms of "topics" in a latent space. Hence, there is a potential to use the learned topics for transfer learning of action "intention" [6] to other robot configurations with different grasping state space.…”

Section: Contributionmentioning

confidence: 99%

“…We employ this approach since it gives an intuitive way of fusing information from different modalities. Furthermore, the topic space can be used for transfer learning, as discussed above [6].…”

Section: Related Workmentioning

confidence: 99%

“…Imitation learning, or learning from demonstration [18], [19], [20], [6], [5], [21] is a challenging problem in robotics. A large amount of context information that used in vision recognition could be well used in robotic imitation learning.…”

Section: Related Workmentioning

confidence: 99%

“…Contextual information has been proven to improve the performance of a number of visual recognition [1], [2], [3], [4] and robotic manipulation [5], [6] tasks. Fig.…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Contextual modeling with labeled multi-LDA

Zhang¹,

Song²

2013

2013 IEEE/RSJ International Conference on Intelligent Robots and Systems

View full text Add to dashboard Cite

Abstract-Learning about activities and object affordances from human demonstration are important cognitive capabilities for robots functioning in human environments, for example, being able to classify objects and knowing how to grasp them for different tasks. To achieve such capabilities, we propose a Labeled Multi-modal Latent Dirichlet Allocation (LM-LDA), which is a generative classifier trained with two different data cues, for instance, one cue can be traditional visual observation and another cue can be contextual information. The novel aspects of the LM-LDA classifier, compared to other methods for encoding contextual information are that, I) even with only one of the cues present at execution time, the classification will be better than single cue classification since cue correlations are encoded in the model, II) one of the cues (e.g., common grasps for the observed object class) can be inferred from the other cue (e.g., the appearance of the observed object). This makes the method suitable for robot online and transfer learning; a capability highly desirable in cognitive robotic applications. Our experiments show a clear improvement for classification and a reasonable inference of the missing data.

show abstract

Section: Related Workmentioning

confidence: 99%

Section: Contributionmentioning

confidence: 99%

Section: Related Workmentioning

confidence: 99%

Section: Related Workmentioning

confidence: 99%

“…Contextual information has been proven to improve the performance of a number of visual recognition [1], [2], [3], [4] and robotic manipulation [5], [6] tasks. Fig.…”

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Contextual modeling with labeled multi-LDA

Zhang¹,

Song²

2013

2013 IEEE/RSJ International Conference on Intelligent Robots and Systems

View full text Add to dashboard Cite

show abstract

Multimodal Learning‐Based Proactive Human Handover Intention Prediction Using Wearable Data Gloves and Augmented Reality

Zou,

Liu,

Zhao

et al. 2024

Advanced Intelligent Systems

View full text Add to dashboard Cite

Efficient object handover between humans and robots holds significant importance within collaborative manufacturing environments. Enhancing the efficacy of human–robot handovers involves enabling robots to comprehend and foresee human handover intentions. This article introduces human‐teaching–robot‐learning‐prediction framework, allowing robots to learn from diverse human demonstrations and anticipate human handover intentions. The framework facilitates human programming of robots through demonstrations utilizing augmented reality and a wearable dataglove, aligned with task requirements and human working preferences. Subsequently, robots enhance their cognitive capabilities by assimilating insights from human handover demonstrations, utilizing deep neural network algorithms. Furthermore, robots can proactively seek clarification from humans via an augmented reality system when confronted with ambiguity in human intentions, mirroring how humans seek clarity from their counterparts. This proactive approach empowers robots to anticipate human intentions and assist human partners during handovers. Empirical results underscore the benefits of the proposed approach, demonstrating highly accurate prediction of human intentions in human–robot handover tasks.

show abstract