Background: Mental workload is a critical consideration in the design of complex man–machine systems. Among the various mental workload detection techniques, multimodal techniques that integrate EEG and fNIRS signals have attracted considerable attention. However, existing EEG–fNIRS-based mental workload detection methods have drawbacks, such as complex signal acquisition channels and low detection accuracy, which restrict their practical application.
Method: The signal acquisition configuration was optimized, and a more accurate and convenient EEG–fNIRS-based mental workload detection method was constructed. A classical MATB task was conducted with 20 volunteers. Subjective scale data, 64-channel EEG data, and two-channel fNIRS data were collected.
Results: Detection accuracy increases with the number of EEG channels. However, accuracy shows no obvious improvement once the number of EEG channels reaches 26, at which point the four-level mental workload detection accuracy is 78.25±4.71%. Part of the physiological analysis confirms the results of previous studies: the θ power of EEG and the concentration of O2Hb in the prefrontal region increase, while the concentration of HHb decreases, with task difficulty. It was further observed, for the first time, that the energy of each band of EEG signals was significantly different in the occipital lobe region, and that the power of the β1 and β2 bands in the occipital region increased significantly with task difficulty. The changing range and the mean amplitude of O2Hb in high-difficulty tasks were significantly higher than those in low-difficulty tasks.
Conclusions: The channel configuration of EEG–fNIRS-based mental workload detection was optimized to 26 EEG channels and two frontal fNIRS channels. A four-level mental workload detection accuracy of 78.25±4.71% was obtained, which is higher than previously reported results. The proposed configuration can promote the application of mental workload detection technology in military, driving, and other complex human–computer interaction systems.