We propose a new solution to the blind source separation problem that factors mixed time-series signals into a sum of spatiotemporal modes, with the constraint that the temporal components are intrinsic mode functions (IMF's). The key motivation is that IMF's allow the computation of meaningful Hilbert transforms of non-stationary data, from which instantaneous time-frequency representations may be derived. Our spatiotemporal intrinsic mode decomposition (STIMD) method leverages spatial correlations to generalize the extraction of IMF's from one-dimensional signals, commonly performed using the empirical mode decomposition (EMD), to multi-dimensional signals.Further, this data-driven method enables future-state prediction. We demonstrate STIMD on several synthetic examples, comparing it to common matrix factorization techniques, namely singular value decomposition (SVD), independent component analysis (ICA), and dynamic mode decomposition (DMD). We show that STIMD outperforms these methods at reconstruction and extracting interpretable modes. Next, we apply STIMD to analyze two real-world datasets, gravitational wave data and neural recordings from the rodent hippocampus.