“…During the last two decades, research in the cognitive sciences, in particular embodied social cognition and cognitive neuroscience, has made profound advances in elucidating the mechanisms underlying the recognition of actions and intentions, which is fundamental to mutual interaction between humans and in which the mirror neuron system plays a significant role (e.g., [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]). As pointed out by Vernon, Thill, and Ziemke [26], among others, achieving action and intention recognition between humans and robots that is as mutual and fluent as in social human-human interaction remains a huge challenge [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], owing to the apparent differences between the fundamental biological mechanisms of living human beings and the technological ones currently used in robots [17, 19, 25, 26, 27, 28, 29]. From a more embodied social cognition perspective, there is a major difference between how living agents enact a social world based on their underlying sensorimotor processes, particularly the mechanisms of the mirror neuron system, and the electrical wiring and technological implementation of artificial cognitive agents such as robots [25, 26, 27, …”