Implementing controls in the car becomes a major challenge: The use of simple physical buttons does not scale to the increased number of assistive, comfort, and infotainment functions. Current solutions include hierarchical menus and multi-functional control devices, which increase complexity and visual demand. Another option is speech control, which is not widely accepted, as it does not support visibility of actions, fine-grained feedback, and easy undo of actions. Our approach combines speech and gestures. By using speech for identification of functions, we exploit the visibility of objects in the car (e.g., mirror) and simple access to a wide range of functions equaling a very broad menu. Using gestures for manipulation (e.g., left/right), we provide fine-grained control with immediate feedback and easy undo of actions. In a user-centered process, we determined a set of user-defined gestures as well as common voice commands. For a prototype, we linked this to a car interior and driving simulator. In a study with 16 participants, we explored the impact of this form of multimodal interaction on the driving performance against a baseline using physical buttons. The results indicate that the use of speech and gesture is slower than using buttons but results in a similar driving performance. Users comment in a DALI questionnaire that the visual demand is lower when using speech and gestures.