Forecasting urban water demand is essential for managing water supply systems (WSSs). Reliable knowledge of future water demand supports informed operational, tactical and strategic decisions. However, the stochastic nature of water demand makes developing a robust forecasting model a challenging task (House-Peters & Chang, 2011). In recent years, machine learning methods have gained considerable attention in the forecasting field, owing to the increasing availability of computational resources and data (Ghalehkhondabi et al., 2017). This trend is likely to continue, thanks to the new era of big data coming from smart meters (Nguyen et al., 2018). Many authors have successfully developed powerful methods to forecast water demand, providing innovative solutions to water utilities. For instance, Herrera et al. (2010) considered multiple machine learning methods to develop a prediction model for the water demand of a city in Spain. Besides showing the superior performance of support vector regression on the case study, the authors also used the prediction output in a hydraulic model, highlighting the importance of the forecasting application.

Artificial neural networks (ANNs) have been widely used, especially for short-term forecasting (Donkor et al., 2014). Among the different architectures, the feedforward ANN has been consistently adopted because it combines strong performance with relatively easy implementation. This is the case in the works of Bougadis et al. (2005), Adamowski and Karapataki (2010), Adamowski et al. (2012), Romano and Kapelan (2014), and Pesantez et al. (2020). Among the many relevant studies on ANN-based water demand forecasting, the work of Ghiassi et al. (2008) stands out: they developed a dynamic artificial neural network model to predict daily, weekly and monthly water demand. In addition, Ghiassi et al.
(2008) applied the model to real water demand data, obtaining reliable forecasts for the three time horizons. Finally, the proposed dynamic artificial neural network was compared with more conventional methods and showed better overall performance. Many other neural network architectures have also been used to develop powerful prediction models. Among them, recurrent ANN models have shown significantly better results. For instance, Guo et al. (2018) proposed a deep learning model based on the gated recurrent unit (GRU) architecture to forecast the water demand of two district metering areas (DMAs) with a 15-min time step. The results showed that the GRU-based model outperformed conventional ANNs. Furthermore, Mu et al. (2020) developed a long short-term memory (LSTM) architecture to predict the water demand of a WSS in China. The proposed model was