This paper analyzes work activity in the home, such as cleaning, performed by two actors: a human and a robot. Attempts are currently being made to automate such activity through the use of robots. However, cleaning in and of itself is not the focus; it is used instrumentally to understand whether and how robots can be integrated into current and future homes. The paper’s theoretical framework is based on empirical data collected as part of the Multimodal Elderly Care Systems (MECS) project. The study proposes a framework for the division of work tasks between humans and robots, anchored in existing research and our empirical findings. Swim-lane diagrams are used to visualize the tasks performed by each of the two actors (WHAT), the tasks’ temporality (WHEN), and their distribution and transition from one actor to the other (WHERE). The framework covers several dimensions of work tasks, including the types of tasks and their temporality and spatiality, and illustrates linear, parallel, sequential, and distributed tasks in a shared or non-shared space. The study’s contribution is a foundation for analyzing the work tasks that robots integrated into or used in the home may generate for humans, along with their multimodal interactions. Finally, the framework can be used to visualize, plan, and design work tasks for the human and for the robot, respectively, and the division of work between them.
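To make the swim-lane dimensions concrete, the following is a minimal Python sketch of how tasks might be recorded and compared. It is an illustration under stated assumptions, not the authors' implementation: the names `Actor`, `Task`, `temporal_relation`, `shares_space`, and `handovers` are hypothetical, the example tasks and times are invented, and treating two tasks as parallel when their time intervals overlap is only one possible reading of the temporality dimension.

```python
from dataclasses import dataclass
from enum import Enum


class Actor(Enum):
    """The two swim lanes (hypothetical names)."""
    HUMAN = "human"
    ROBOT = "robot"


@dataclass(frozen=True)
class Task:
    actor: Actor  # lane the task belongs to
    what: str     # WHAT: the task performed
    start: int    # WHEN: start time, in arbitrary units
    end: int      # WHEN: end time (exclusive)
    space: str    # spatial dimension, used for shared vs. non-shared space


def temporal_relation(a: Task, b: Task) -> str:
    """Return 'parallel' if the time intervals overlap, else 'sequential'."""
    overlaps = a.start < b.end and b.start < a.end
    return "parallel" if overlaps else "sequential"


def shares_space(a: Task, b: Task) -> bool:
    """Return True if both tasks take place in the same space."""
    return a.space == b.space


def handovers(tasks: list[Task]) -> int:
    """Count transitions from one actor to the other, ordered by start time."""
    ordered = sorted(tasks, key=lambda t: t.start)
    return sum(1 for prev, nxt in zip(ordered, ordered[1:])
               if prev.actor != nxt.actor)


# Invented example data, for illustration only.
tasks = [
    Task(Actor.HUMAN, "clear floor of obstacles", 0, 5, "living room"),
    Task(Actor.ROBOT, "vacuum floor", 5, 25, "living room"),
    Task(Actor.HUMAN, "wipe surfaces", 10, 20, "kitchen"),
    Task(Actor.HUMAN, "empty dust bin", 25, 28, "living room"),
]
```

In this sketch, clearing the floor and vacuuming are sequential tasks in a shared space, while vacuuming and wiping surfaces are parallel tasks in non-shared spaces.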