Relying on a single technology with a single interaction modality may benefit some users but excludes many more who face impediments to using that modality. The solution is to include multiple modalities in the initial design of the interactive system, making it adaptable to the needs of a much wider range of users. Supporting many modalities, however, can rapidly increase the number of interaction objects that must receive the stream of user commands, especially when the user interacts with multiple artifacts in a home automation environment. In this paper, we present the general architecture of an ongoing project for a multimodal home automation system. The system relies on a web-based database, Firebase, for the exchange of user input and the issuing of commands to the multiple artifacts. User input is acquired through a smartphone and a webcam-equipped computer, which capture the user's tactile input, vocal phrases, eye gaze, and head-pose features such as tilt and face direction. We achieved reliable data transfer between the database and the different input acquisition interfaces. As a first step in prototyping the system, we were able to control two separate game interfaces developed with the Unity3D engine.
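The exchange pattern described above, where input modalities publish commands into a shared database and each artifact subscribes to its own path, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the path names and command fields are hypothetical, and an in-memory dictionary with change listeners stands in for the Firebase Realtime Database.

```python
import time

class CommandBus:
    """Stand-in for a Firebase-style key/value store with change listeners.

    A real deployment would use the Firebase client SDK or REST API;
    this class only mimics the publish/subscribe shape of the exchange.
    """

    def __init__(self):
        self._store = {}
        self._listeners = {}

    def on_change(self, path, callback):
        # Register a callback fired whenever `path` is written.
        self._listeners.setdefault(path, []).append(callback)

    def put(self, path, value):
        # Write a value and notify every subscriber of that path.
        self._store[path] = value
        for cb in self._listeners.get(path, []):
            cb(value)

    def get(self, path):
        return self._store.get(path)


# One input modality (e.g. the smartphone's voice interface) publishes
# a command; the target artifact subscribes to its own command path.
received = []
bus = CommandBus()
bus.on_change("artifacts/lamp/command", lambda cmd: received.append(cmd))

bus.put("artifacts/lamp/command",
        {"action": "toggle", "source": "voice", "ts": time.time()})

print(received[0]["action"])  # toggle
```

Routing every modality through one shared store decouples the input-acquisition interfaces from the artifacts: adding a new modality or a new controlled device only requires agreeing on a database path and a command schema.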