Current whole-brain models are generally tailored to a single data modality (e.g., fMRI or MEG/EEG). Although different imaging modalities reflect different aspects of neural activity, we hypothesise that this activity arises from common network dynamics. Building on the universal principles of self-organising delay-coupled nonlinear systems, we aim to link distinct electromagnetic and metabolic features of brain activity to the dynamics on the macroscopic structural connectome. To jointly predict dynamical and functional connectivity features of distinct signal modalities, we consider two large-scale models that generate local short-lived 40 Hz oscillations with different degrees of realism, namely the Stuart-Landau (SL) and Wilson-Cowan (WC) models. To this end, we measure features of functional connectivity and metastable oscillatory modes (MOMs) in fMRI and MEG signals and compare them against simulated data. By varying the global coupling and the mean conduction time delay, we show that both models can represent MEG functional connectivity (FC) and functional connectivity dynamics (FCD) to a comparable degree. For both models, omitting delays dramatically decreases performance. For fMRI, the SL model performs worse than the WC model for FCD, highlighting the importance of balanced dynamics for the emergence of spatiotemporal patterns of ultra-slow dynamics. Notably, optimal working points vary across modalities, and neither model achieves a correlation with empirical FC higher than 0.45 across modalities for the same set of parameters. Nonetheless, both models display the emergence of FC patterns beyond the anatomical framework. Finally, we show that both models can generate MOMs with empirical-like properties. Our results demonstrate the emergence of static and dynamic properties of neural activity at different timescales from networks of delay-coupled oscillators at 40 Hz.
Given the higher dependence of simulated FC on the underlying structural connectivity, we suggest that mesoscale heterogeneities in neural circuitry may be critical for the emergence of parallel cross-modal functional networks and should be accounted for in future modelling endeavours.
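
As an illustration of the kind of model and measures described above, the following is a minimal sketch of a network of delay-coupled Stuart-Landau oscillators at 40 Hz, together with simple FC and FCD estimates. It is not the implementation used in this work. The equation form (a commonly used diffusive delayed coupling), the function names `simulate_delayed_sl`, `fc_matrix` and `fcd_matrix`, and all parameter values (bifurcation parameter `a`, noise level, time step, window length) are assumptions chosen for illustration. FC and FCD are computed here on the raw simulated signals, whereas a modality-specific pipeline would first apply, for example, band-pass filtering and envelope extraction for MEG or a haemodynamic model for fMRI.

```python
import numpy as np


def simulate_delayed_sl(C, D, K=1.0, mean_delay=0.005, f=40.0, a=-5.0,
                        dt=1e-4, t_max=2.0, noise=1e-3, seed=0):
    """Euler-Maruyama integration of delay-coupled Stuart-Landau units.

    Assumed form (illustrative, not taken from the paper):
        dz_n/dt = z_n (a + i*w - |z_n|^2)
                  + K * sum_p C[n, p] * (z_p(t - tau[n, p]) - z_n(t)) + noise
    C: (N, N) coupling weights, D: (N, N) tract lengths (arbitrary units).
    Returns the real part of z with shape (n_steps, N).
    """
    rng = np.random.default_rng(seed)
    N = C.shape[0]
    omega = 2.0 * np.pi * f

    # Scale lengths so the mean delay over existing links equals mean_delay.
    mask = C > 0
    mean_len = D[mask].mean() if mask.any() else 1.0
    delays = np.rint(D / mean_len * mean_delay / dt).astype(int)  # in steps
    max_delay = int(delays.max())

    n_steps = int(round(t_max / dt))
    z = np.zeros((max_delay + 1 + n_steps, N), dtype=complex)
    # Random small-amplitude history for t <= 0.
    z[:max_delay + 1] = noise * (
        rng.standard_normal((max_delay + 1, N))
        + 1j * rng.standard_normal((max_delay + 1, N))
    )

    cols = np.arange(N)
    for t in range(max_delay, max_delay + n_steps):
        z_now = z[t]
        # z_delayed[n, p] = z_p(t - tau[n, p])
        z_delayed = z[t - delays, cols]
        coupling = K * np.sum(C * (z_delayed - z_now[:, None]), axis=1)
        dz = z_now * (a + 1j * omega - np.abs(z_now) ** 2) + coupling
        eta = noise * np.sqrt(dt) * (
            rng.standard_normal(N) + 1j * rng.standard_normal(N)
        )
        z[t + 1] = z_now + dt * dz + eta

    return z[max_delay + 1:].real


def fc_matrix(x):
    """Static FC: Pearson correlation between all pairs of signals."""
    return np.corrcoef(x.T)


def fcd_matrix(x, win, step):
    """FCD: correlation between the upper triangles of sliding-window FCs."""
    iu = np.triu_indices(x.shape[1], k=1)
    starts = range(0, x.shape[0] - win + 1, step)
    fcs = np.array([np.corrcoef(x[s:s + win].T)[iu] for s in starts])
    return np.corrcoef(fcs)
```

In this sketch, `K` and `mean_delay` play the role of the global coupling and mean conduction delay that are varied in the parameter exploration; setting `mean_delay=0` gives the delay-free case. With `a < 0` each unit is a noise-driven damped oscillator, which is one way to obtain short-lived oscillations, but the regime used in the study may differ.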