For greater efficiency, human–machine and human–robot interactions must be designed with the idea of multimodality in mind. To allow the use of several interaction modalities, such as the use of voice, touch, gaze tracking, on several different devices (computer, smartphone, tablets, etc.) and to integrate possible connected objects, it is necessary to have an effective and secure means of communication between the different parts of the system. This is even more important with the use of a collaborative robot (cobot) sharing the same space and very close to the human during their tasks. This study present research work in the field of multimodal interaction for a cobot using the MQTT protocol, in virtual (Webots) and real worlds (ESP microcontrollers, Arduino, IOT2040). We show how MQTT can be used efficiently, with a common publish/subscribe mechanism for several entities of the system, in order to interact with connected objects (like LEDs and conveyor belts), robotic arms (like the Ned Niryo), or mobile robots. We compare the use of MQTT with that of the Firebase Realtime Database used in several of our previous research works. We show how a “pick–wait–choose–and place” task can be carried out jointly by a cobot and a human, and what this implies in terms of communication and ergonomic rules, via health or industrial concerns (people with disabilities, and teleoperation).