A multimodal interface was designed for controlling the functions of a complex modular robotic system that can be deployed in difficult conditions such as rescue operations, natural disasters, fires, and decontamination tasks. The interface builds on several fundamental technologies: speech recognition, speech synthesis, and dialogue management. To enable a human operator to cooperate with the robotic system, an architecture was designed and these technologies were implemented. The automatic speech recognition system is based on hidden Markov models and allows the operator to control the functions of the system using a set of voice commands. A text-to-speech system was prepared to give feedback to the operator, and dialogue manager technology was adopted to handle the exchange of information between the operator and the robotic system. The proposed system is further equipped with an acoustic event detection system, which consists of a set of five microphones integrated on the robotic vehicle, a post-processing unit, and a detection unit.