This paper proposes several speech technology improvements that increase the robustness, reliability and ergonomics of speech interfaces for controlling aerial vehicles. These improvements consist of a statistical language model that increases robustness against spontaneous speech, confidence measures for evaluating the performance of the speech engines on-line (better reliability), and flexible response generation that improves the interface ergonomics. The paper includes a detailed description of the speech control interface developed through the collaboration between the GTH (Grupo de Tecnología del Habla, or Speech Technology Group) at Universidad Politécnica de Madrid (UPM) and the company Boeing Research and Technology Europe under contract No. 206/05. This interface includes modules that perform speech recognition, natural language understanding and response generation via a speech synthesizer. In the system evaluation, the final results were a Word Accuracy of 96.4% and a Semantic Concept Accuracy of 92.2%. The paper also provides a state-of-the-art review of the use of Speech Technology for controlling aerial vehicles, comparing the main initiatives carried out. A significant conclusion of this work is that Speech Technology is now mature enough to be considered as a new modality (in parallel with traditional ones) for introducing high-level commands while the controller is carrying out other actions when interacting with these control systems. In critical applications such as this one, the best performance of this technology is achieved when all the configuration possibilities of the speech engines are accessible and the speech interface is designed in collaboration with Speech Technology experts.