Bayesian optimization has become a widely used tool in the optimization and machine learning communities. It is suitable to problems as simulation/optimization and/or with an objective function computationally expensive to evaluate. Bayesian optimization is based on a surrogate probabilistic model of the objective whose mean and variance are sequentially updated using the observations and an "acquisition" function based on the model, which sets the next observation at the most "promising" point. The most used surrogate model is the Gaussian Process which is the basis of well-known Kriging algorithms. In this paper, the authors consider the pump scheduling optimization problem in a Water Distribution Network with both ON/OFF and variable speed pumps. In a global optimization model, accounting for time patterns of demand and energy price allows significant cost savings. Nonlinearities, and binary decisions in the case of ON/OFF pumps, make pump scheduling optimization computationally challenging, even for small Water Distribution Networks. The well-known EPANET simulator is used to compute the energy cost associated to a pump schedule and to verify that hydraulic constraints are not violated and demand is met. Two Bayesian Optimization approaches are proposed in this paper, where the surrogate model is based on a Gaussian Process and a Random Forest, respectively. Both approaches are tested with different acquisition functions on a set of test functions, a benchmark Water Distribution Network from the literature and a large-scale real-life Water Distribution Network in Milan, Italy.