Building-integrated photovoltaics (BIPV) is a promising technology for decarbonizing urban energy systems by harnessing the solar energy available on building envelopes. Nevertheless, balancing the trade-off between modelling effort, computation speed and spatio-temporal resolution when evaluating 3D BIPV solar potential in a complex urban context remains challenging. Existing physics-based solar simulation engines require significant manual modelling effort and computing time to obtain high-resolution results, and those results are deterministic. Yet solar irradiation is highly intermittent, and representing its inherent uncertainty may be required for designing robust energy systems. To address these drawbacks, this paper proposes a data-driven model based on Deep Generative Networks (DGN) that efficiently generates high-fidelity stochastic ensembles of annual hourly solar irradiance time series at the urban scale without compromising spatio-temporal resolution. It requires only easily accessible inputs, i.e., simple fisheye images used as categorical masks, such as those captured from Level of Detail (LOD) 1 urban geometry models. Our validation demonstrates the high fidelity of the generated solar time series when compared with those of a physics-based simulator. To demonstrate the model’s relevance for urban energy design, we apply it to the resilient design of a district multi-energy system (MES) with several hundred BIPV surfaces. Furthermore, we showcase the model’s potential for generative design by parametrically altering the urban environment and producing the corresponding irradiation time series in real time.