The concept of action as a basic motor control unit for goal-directed movement behavior has been used primarily for private or non-communicative actions such as walking, reaching, or grasping. This paper reviews literature indicating that the concept can also be applied in all domains of face-to-face communication, such as speech, co-verbal facial expression, and co-verbal gesturing. Three domain-specific types of actions are defined (speech actions, facial actions, and hand-arm actions), and a model is proposed that elucidates the underlying biological mechanisms of action production, action perception, and action acquisition in all domains of face-to-face communication. This model can be used as a theoretical framework for empirical analysis or for simulation with embodied conversational agents, and thus for advanced human-computer interaction technologies.