Natural user interfaces (NUIs) have been used to reduce driver distraction while operating in-vehicle infotainment systems (IVIS), and multimodal interfaces have been applied to compensate for the shortcomings of any single modality in an NUI. These multimodal NUIs have varying effects on different types of driver distraction and on different stages of drivers' secondary tasks. However, current studies provide only a limited understanding of NUIs: the design of multimodal NUIs is typically based on evaluating the strengths of individual modalities in isolation, and existing studies of multimodal NUIs do not use equivalent comparison conditions. To address this gap, we compared five single modalities commonly used for NUIs (touch, mid-air gesture, speech, gaze, and physical buttons on the steering wheel) during a lane change task (LCT) to provide a more holistic view of driver distraction. Our findings suggest that the best approach is a cascaded multimodal interface that accounts for the characteristics of each single modality. We therefore compared several combinations of cascaded modalities, matching the characteristics of each modality to the sequential phases of the command input process. Our results show that the combinations speech + button, speech + touch, and gaze + button are the best cascaded multimodal interfaces for reducing driver distraction with IVIS.

INDEX TERMS Cascaded multimodal interface, driver distraction, head-up display (HUD), human-computer interaction (HCI), in-vehicle infotainment system (IVIS), learning effect, natural user interface (NUI).