The scarcity of data on flood events poses challenges for flood management. In this paper, we propose a novel approach that enhances flood-forecasting models by using Generative Adversarial Networks (GANs) to generate synthetic flood events. We modified a time-series GAN by incorporating constraints related to mass conservation, energy balance, and hydraulic principles, both through appropriate regularization terms in the loss function and by using mass-conserving LSTMs in the generator and discriminator. These constraints ensure that the synthetic flood-event data adhere to fundamental hydrological principles, which improves the realism and physical consistency of the generated extreme flood events and, in turn, the accuracy and reliability of flood-forecasting and risk-assessment applications. PCA and t-SNE are applied to examine the structure and distribution of the synthetic flood data, highlighting patterns, clusters, and relationships within the data. The generated synthetic data are used to supplement the original data when training a probabilistic neural runoff model for multi-step-ahead flood forecasting. A t-test was performed to compare the means of the synthetic data generated by TimeGAN with those of the original data, and the results were statistically significant at the 95% level. Integrating the time-series GAN-generated synthetic flood events with real data improved the robustness and accuracy of the autoencoder model, enabling more reliable predictions of extreme flood events.
In the pilot study, the model trained on the dataset augmented with synthetic data from the time-series GAN achieves higher scores at the sixth hour ahead (NSE = 0.838, KGE = 0.908) than the model trained on the original data alone (NSE = 0.829, KGE = 0.90), indicating an improvement of 9.8% in NSE for multistep-ahead predictions of extreme flood events. Integrating synthetic training data into the probabilistic forecasting reduces the Prediction Interval Normalized Average Width (PINAW) of the interval forecasts, although this gain comes with a trade-off in the Prediction Interval Coverage Probability (PICP).
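
For readers unfamiliar with the four scores quoted above, the following minimal sketch shows how they are commonly computed. It assumes the standard textbook definitions (Nash-Sutcliffe efficiency, the 2009 form of the Kling-Gupta efficiency, and range-normalized interval width); the exact variants used in this study are not stated here and may differ. The function names and the sample arrays are illustrative only and are not taken from the study.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form, assumed): combines correlation,
    variability ratio, and bias ratio."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def picp(obs, lower, upper):
    """Fraction of observations that fall inside the prediction interval."""
    obs, lower, upper = (np.asarray(a, float) for a in (obs, lower, upper))
    return float(np.mean((obs >= lower) & (obs <= upper)))

def pinaw(obs, lower, upper):
    """Mean interval width, normalized by the range of the observations."""
    obs, lower, upper = (np.asarray(a, float) for a in (obs, lower, upper))
    return float(np.mean(upper - lower) / (obs.max() - obs.min()))

if __name__ == "__main__":
    # Hypothetical discharge values, for illustration only.
    obs = np.array([10.0, 25.0, 60.0, 40.0, 20.0])
    sim = np.array([12.0, 22.0, 55.0, 43.0, 18.0])
    lower, upper = sim - 5.0, sim + 5.0
    print("NSE  ", nse(obs, sim))
    print("KGE  ", kge(obs, sim))
    print("PICP ", picp(obs, lower, upper))
    print("PINAW", pinaw(obs, lower, upper))
```

Under these definitions the trade-off noted above is direct: narrowing the intervals lowers PINAW, but every observation that then falls outside an interval lowers PICP.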