Building a real-time, cost-effective, spatio-temporal forecasting system is a challenging problem with many practical applications, such as traffic and road network management. Most forecasting research focuses on average prediction quality, with less attention paid to building practical pipelines and achieving timely and accurate forecasts when the network is under heavy load. Additionally, transport authorities need to leverage dynamic data sources (e.g., scheduled roadworks) and vehicle-level flow data, while also supporting ad-hoc inference workloads at low cost. The cloud-based system Foresight, developed in collaboration with Transport for the West Midlands (TfWM), ingests, aggregates and processes streamed traffic data, as well as dynamic urban events/flow data, to produce regularly scheduled forecasts with high accuracy. In this work, we extend our system with several novel enhancements. First, we present an efficient method for extending the forecasting scale, enabling transport managers to predict traffic patterns further into the future than existing methods allow. In addition, we augment the existing inference architecture with a new, fully serverless design. This offers a more cost-effective inference solution, which seamlessly handles sporadic inference workloads over multiple forecasting models. We observe that Graph Neural Network (GNN) forecasting models are robust to extensions of the forecasting scale, achieving consistent (and sometimes even improved) performance up to 24 hours ahead. This is in contrast to the 1-hour forecasting horizons popularly considered in the literature. Further, our serverless inference solution is shown to be significantly more cost-effective than provisioned alternatives in appropriate use cases. We identify the optimal memory configuration of serverless resources to achieve an attractive cost-to-performance ratio.