Satellite‐based optical video sensors are poised to become the next frontier in remote sensing. Satellite video offers the unique advantage of capturing the transient dynamics of floods, with the potential to supply hitherto unavailable data for the assessment of hydraulic models. A prerequisite for the successful application of hydraulic models is their proper calibration and validation. In this investigation, we validate 2D flood model predictions using satellite video‐derived flood extents and velocities. Hydraulic simulations of a flood event with a 5‐year return period (discharge of 722 m³ s⁻¹) were conducted using the Hydrologic Engineering Center—River Analysis System (HEC‐RAS) 2D model for the Darling River at Tilpa, Australia. To extract flood extents from satellite video of the studied event, we use a hybrid deep neural network comprising a transformer encoder and a convolutional neural network (CNN) decoder. We evaluate the influence of test‐time augmentation (TTA), the application of transformations to ensembles of test satellite video images during deep neural network inference. We employ Large‐Scale Particle Image Velocimetry (LSPIV) for non‐contact estimation of river surface velocities from sequential satellite video frames. When validating hydraulic model simulations against deep neural network‐segmented flood extents, the critical success index peaked at 94%, with an average relative improvement of 9.5% when TTA was applied. We show that TTA offers significant value in deep neural network‐based image segmentation, compensating for aleatoric uncertainties. Correlations between model‐predicted and LSPIV velocities were reasonable, averaging 0.78. Overall, our investigation demonstrates the potential of optical space‐based video sensors for validating flood models and studying flood dynamics.
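The two evaluation ingredients named above, test‐time augmentation and the critical success index, can be illustrated with a minimal sketch. This is not the paper's implementation: the `toy_model` below is a hypothetical stand‐in for the transformer‐CNN segmentation network, and the flip set is one common TTA choice assumed here for illustration. CSI is computed as hits / (hits + false alarms + misses) on binary flood masks.

```python
import numpy as np

def critical_success_index(pred, truth):
    """CSI = TP / (TP + FP + FN) for binary flood-extent masks."""
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return tp / (tp + fp + fn)

def tta_predict(model, image):
    """Average model outputs over flip augmentations (an assumed TTA set).

    Each augmentation is paired with its inverse so that the predicted
    probability maps are re-aligned before averaging.
    """
    augments = [
        (lambda x: x, lambda y: y),  # identity
        (np.fliplr, np.fliplr),      # horizontal flip and its inverse
        (np.flipud, np.flipud),      # vertical flip and its inverse
    ]
    probs = [invert(model(aug(image))) for aug, invert in augments]
    return np.mean(probs, axis=0)

# Hypothetical stand-in "model": thresholds pixel intensity into a
# flood probability map; a real model would be the trained network.
toy_model = lambda img: (img > 0.5).astype(float)

np.random.seed(0)
img = np.random.rand(8, 8)          # synthetic test frame
truth = img > 0.5                   # synthetic reference flood mask
mask = tta_predict(toy_model, img) >= 0.5
print(critical_success_index(mask, truth))  # → 1.0 for this toy model
```

In practice the averaged probability map from `tta_predict` is thresholded into a flood mask and scored with `critical_success_index` against the reference extent; the averaging over augmented views is what the abstract credits with compensating for aleatoric uncertainty.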