Deep learning models can effectively capture the non-linear spatiotemporal dynamics of city-wide traffic. The literature on deep learning-based traffic prediction abounds with evidence that model performance varies across cities, prediction horizons, scales, specific city regions, and hours of the day, yet no unified metric exists to quantify the complexity of different prediction tasks. This paper proposes two such metrics: model complexity (MC) and intrinsic complexity (IC). MC quantifies the effective complexity of a deep learning model on a city-wide traffic prediction task, whereas IC quantifies the underlying complexity of the prediction task itself. As an effective complexity metric, MC depends on both the model and the data; IC depends only on the data and is invariant to the model used. Both metrics are validated through systematic experiments on traffic volume data from three cities. Finally, we demonstrate how these metrics can improve the workflows for building and deploying deep learning-based traffic prediction pipelines by reducing the hyperparameter search scope and by supporting comparison of the effectiveness of different design pathways.