We consider, evaluate, and develop methods for home rehabilitation scenarios, and we present the modules that such a scenario requires. Because of the large number of modules, the framework falls into the category of Composite AI. Our work is based on collected videos showing high-quality exercise execution and samples of typical errors. These are augmented with sample dialogues about the exercise to be executed and the assumed errors. We study and discuss body pose estimation technology, dialogue systems of different kinds, and the emerging constraints of verbal communication. We demonstrate that optimizing the camera pose and the body pose allows high-precision recording and requires the following components: (1) a 3D representation of the environment for the optimization, (2) a navigation dialogue to guide the patient to the optimal pose, and (3) semantic and instance maps for verbal navigation instructions. We propose different communication methods, ranging from video-based presentation through rule-based methods to chit-chat-like dialogues. For the different aspects of these challenges, we discuss methods that can improve the performance of the individual components. Given the emerging solutions, we argue that the range of applications will grow drastically in the near future.