Planning precise manipulation in robotics to perform grasp and release-related operations, while interacting with humans is a challenging problem. Reinforcement learning (RL) has the potential to make robots attain this capability. In this paper, we propose an affordance-based human-robot interaction (HRI) framework, aiming to reduce the action space size that would considerably impede the exploration efficiency of the agent. The framework is based on a new algorithm called Contextual Q-learning (CQL). We first show that the proposed algorithm trains in a reduced amount of time (2.7 seconds) and reaches an 84% of success rate. This suits the robot's learning efficiency to observe the current scenario configuration and learn to solve it. Then, we empirically validate the framework for implementation in HRI real-world scenarios. During the HRI, the robot uses semantic information from the state and the optimal policy of the last training step to search for relevant changes in the environment that may trigger the generation of a new policy.
INDEX TERMSQ-learning, Robotics, Affordances, Robot learning, Human-Robot Interaction [14]. Passive observation is when the robot learns from the user through video streams [15]. With teleoperation, the user