Interpreting the behaviour of autonomous machines will be a daily activity for future generations. Yet surprisingly little is currently known about how people ascribe intentions to human-like and non-human-like agents or objects. In a series of six experiments, we compared people’s ability to extract non-mentalistic (i.e., where an agent is looking) and mentalistic (i.e., what an agent is looking at; what an agent is going to do) information from identical gaze and head movements performed by humans, human-like robots, and a non-human-like object. Results showed that people were faster to infer the mental content of human agents than of robotic agents. Furthermore, the form of the non-human entity may engage mentalizing processes differently depending on how human-like its appearance is. These results are not easily explained by non-mentalizing strategies (e.g., spatial accounts), as we observed no clear differences across the three agents in the control conditions. Overall, the results suggest that the actions of human-like robots may be processed differently from the behaviour of both humans and objects. We discuss the extent to which these findings inform our understanding of the role that an agent’s or object’s physical features play in triggering mentalizing abilities, and their relevance for human–robot interaction.