As human–machine teams are being considered for a variety of mixed‐initiative tasks, detecting and responding to human cognitive states, in particular systemic cognitive states, is among the most critical capabilities artificial systems need to ensure smooth interactions with humans and high overall team performance. Various human physiological parameters, such as heart rate, respiration rate, blood pressure, and skin conductance, as well as brain activity inferred from functional near‐infrared spectroscopy or electroencephalography, have been linked to different systemic cognitive states, such as workload, distraction, or mind wandering. Whether these multimodal signals are indeed sufficient to isolate such cognitive states across individuals performing tasks, or whether additional contextual information (e.g., about the task state or the task environment) is required for making appropriate inferences, remains an important open problem. In this paper, we introduce an experimental and machine learning framework for investigating these questions, focusing on the use of physiological and neurophysiological measurements to learn classifiers associated with systemic cognitive states like cognitive load, distraction, sense of urgency, mind wandering, and interference. Specifically, we describe a multitasking interactive experimental setting used to obtain a comprehensive multimodal data set, which provided the foundation for a first evaluation of various standard state‐of‐the‐art machine learning techniques with respect to their effectiveness in inferring systemic cognitive states.
The classification success of these standard methods across subjects, based on the physiological and neurophysiological signals alone, was modest. This is to be expected given the complexity of the classification problem and the possibility that higher accuracy rates might not in general be achievable. The results can nevertheless serve as a baseline for evaluating future efforts to improve classification, especially methods that take contextual aspects such as task and environmental states into account.