Intermittent solar irradiance due to passing clouds poses challenges for integrating solar energy into existing infrastructure. By making use of intra‐hour nowcasts (very short‐term forecasts), changing conditions of solar irradiance can be anticipated. All‐sky imagers, capturing sky conditions at high spatial and temporal resolution, can be the basis of such nowcasting systems. In this work, a deep learning (DL) model for solar irradiance nowcasts based on the transformer architecture is presented. The model is trained end‐to‐end using sequences of sky images and irradiance measurements as input to generate point‐forecasts up to 20 min ahead. Further, the effect of integrating this model into a hybrid system, consisting of a physics‐based model and smart persistence, is examined. A comparison between the DL model and two hybrid models (with and without the DL model) is conducted on a benchmark dataset. Forecast accuracy for deterministic point‐forecasts is analyzed under different conditions using standard error metrics such as root‐mean‐square error (RMSE) and forecast skill (FS). Furthermore, spatial and temporal aggregation effects are investigated. In addition, probabilistic nowcasts for each model are computed via a quantile approach. Overall, the DL model outperforms both hybrid models under the majority of conditions and aggregation settings.
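
For readers unfamiliar with the two error metrics named above, the following is a minimal sketch of how they are commonly computed. It assumes the widely used definition FS = 1 − RMSE_model / RMSE_reference, with a reference forecast such as persistence; the abstract does not state the exact formulation or reference used in this work, so that choice is an assumption. The function names and the sample values are hypothetical and purely illustrative.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(forecast) != len(observed) or len(forecast) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def forecast_skill(model_forecast, reference_forecast, observed):
    """Forecast skill relative to a reference forecast (assumed definition).

    1.0 means a perfect model, 0.0 means no improvement over the
    reference, and negative values mean the model is worse than it.
    """
    reference_error = rmse(reference_forecast, observed)
    if reference_error == 0.0:
        raise ValueError("reference RMSE is zero; skill is undefined")
    return 1.0 - rmse(model_forecast, observed) / reference_error


if __name__ == "__main__":
    # Made-up irradiance values in W/m^2, for illustration only.
    observed = [500.0, 520.0, 300.0, 310.0]
    reference = [480.0, 500.0, 520.0, 300.0]  # hypothetical persistence-style forecast
    model = [495.0, 510.0, 350.0, 320.0]      # hypothetical model forecast
    print("RMSE model:", rmse(model, observed))
    print("RMSE reference:", rmse(reference, observed))
    print("Forecast skill:", forecast_skill(model, reference, observed))
```

A positive skill value indicates that the model has a lower RMSE than the reference on the same samples, which is the sense in which one model "outperforms" another under this metric.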