Manned-Unmanned Teaming (MUM-T) can be defined as the teaming of aerial robots (artificial agents) with a human pilot (natural agent), in which the human agent is not an authoritative controller but rather a cooperative team player. To our knowledge, no study has yet evaluated the impact of MUM-T scenarios on operators' mental workload (MW) using a neuroergonomic approach (i.e., using physiological measures), nor provided an MW estimation through classification applied to those measures. Moreover, the impact of the non-stationarity of physiological signals is seldom taken into account in classification pipelines, particularly in the validation design. This study was therefore designed with two goals: (i) to characterize and estimate MW in a MUM-T setting based on physiological signals; (ii) to assess the impact of the validation procedure on classification accuracy. To this end, a search and rescue (S&R) scenario was developed in which 14 participants played the role of a pilot cooperating with three Unmanned Aerial Vehicles (UAVs). Missions were designed to induce high and low MW levels, which were evaluated using self-reported, behavioral and physiological measures (i.e., cerebral (electroencephalography, EEG), cardiac (electrocardiography, ECG), and oculomotor (eye-tracking, ET) features). Supervised classification pipelines based on various combinations of these physiological features were benchmarked, and two validation procedures were compared (i.e., a traditional one that does not take time into account vs. an ecological one that does). The main results are: (i) a significant impact of MW on all measures, and (ii) a higher intra-subject classification accuracy (75%) when using ECG features alone or in combination with EEG and ET features, with the AdaBoost, Linear Discriminant Analysis, or Support Vector Machine classifiers. However, this held only with the traditional validation procedure: classification accuracy dropped significantly with the ecological one.
Interestingly, inter-subject classification with ecological validation (59.8%) surpassed both intra-subject classification with ecological validation and inter-subject classification with traditional validation. These results highlight the need for further developments to perform MW monitoring in such operational contexts.
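
The contrast between the two validation procedures can be illustrated with a minimal sketch. It assumes that the traditional procedure resembles a shuffled k-fold split and that the ecological procedure trains on earlier samples and tests on later ones; the abstract does not specify either design, so the function names, the fold count, and the 80/20 split below are hypothetical choices made only for illustration.

```python
import random


def shuffled_kfold(n_samples, n_folds, seed=0):
    """Shuffled k-fold (assumed 'traditional' design): time order is ignored."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds disjoint folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


def chronological_split(n_samples, train_fraction=0.8):
    """Time-aware split (assumed 'ecological' design): train early, test late."""
    cut = int(n_samples * train_fraction)
    return list(range(cut)), list(range(cut, n_samples))


def respects_time(train, test):
    """True when every training sample precedes every test sample."""
    return max(train) < min(test)


if __name__ == "__main__":
    # Sample indices stand for epochs in recording order.
    for train, test in shuffled_kfold(20, 5):
        print("shuffled fold, test =", test, "respects time:", respects_time(train, test))
    train, test = chronological_split(20)
    print("chronological, test =", test, "respects time:", respects_time(train, test))
```

With shuffled folds, test epochs are generally interleaved with training epochs, so a classifier can exploit slow signal drift shared by neighboring epochs. The chronological split removes that advantage, which is one plausible reason for accuracy to drop when time is taken into account.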