Children exchange information through multiple modalities, including verbal communication, gestures, and social gaze, and they gradually learn to plan their behavior and coordinate successfully with their partners. The development of joint attention and joint action, especially in the context of social play, provides rich opportunities for describing the characteristics of interactions that can lead to shared outcomes. In the present work, we argue that human–robot interaction (HRI) can benefit from these developmental studies, because their findings can be used to shape the human’s perception and interpretation of the robot’s behavior. We therefore describe some components that could be implemented in a robot to strengthen the human’s feeling of dealing with a social agent, and thereby improve the success of collaborative tasks. Focusing in particular on motor precision, coordination, and anticipatory planning, we discuss the question of complexity in HRI. In the context of joint activities, we highlight the necessity of (1) considering multiple speech acts involving multimodal communication (both verbal and non-verbal signals), and (2) analyzing the forms and functions of communication separately. Finally, we examine some challenges related to robot competencies, such as the issue of language and symbol grounding, which might be tackled by bringing together the expertise of researchers in developmental psychology and robotics.