Hydraulic models of water distribution systems (WDSs) must be calibrated before they can support informed decision-making. Calibration is usually an iterative process: simulation results from the model are compared with field observations, and model parameters are adjusted until predicted and measured values (e.g., water pressure) reach an acceptable level of agreement. Performed manually, this process can be time-consuming, and its termination criterion relies on the modeler’s judgment. Various optimization-based calibration methods have therefore been developed. In this study, three optimization methods, namely Sequential Least Squares Programming (SLSQP), a Genetic Algorithm (GA), and Differential Evolution (DE), are compared for calibrating the pipe roughness of WDS models. Their performance is investigated over four decision variable set formulations with different levels of search space discretization. Results from a real-world case study demonstrate that, compared with traditional engineering practice, optimization is effective for hydraulic model calibration. However, a finer search space discretization does not necessarily yield better results, and when multiple methods achieve similar performance, the simpler method is preferable. This study provides guidance on method and formulation selection for calibrating WDS models.
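
To make the calibration idea concrete, the following is a minimal sketch and not the study's implementation. It assumes a hypothetical three-pipe series network with known flows, the Hazen-Williams head loss formula in SI units, noise-free synthetic observations, and a sum-of-squared-errors objective. All names and values (`LENGTHS`, `DIAMETERS`, `FLOWS`, `TRUE_C`, `BOUNDS`, and the functions) are illustrative assumptions. SLSQP and DE are taken from SciPy; a GA is omitted because SciPy does not provide one.

```python
import numpy as np
from scipy.optimize import minimize, differential_evolution

# Hypothetical toy network: a reservoir feeding three pipes in series.
LENGTHS = np.array([500.0, 400.0, 300.0])   # pipe lengths, m
DIAMETERS = np.array([0.30, 0.25, 0.20])    # pipe diameters, m
FLOWS = np.array([0.08, 0.05, 0.02])        # assumed known pipe flows, m^3/s
SOURCE_HEAD = 60.0                          # reservoir head, m


def simulate_heads(c):
    """Head at the downstream node of each pipe (Hazen-Williams, SI units)."""
    c = np.asarray(c, dtype=float)
    head_loss = 10.67 * LENGTHS * FLOWS**1.852 / (c**1.852 * DIAMETERS**4.87)
    return SOURCE_HEAD - np.cumsum(head_loss)


# Synthetic "field" observations generated from assumed true roughness values.
TRUE_C = np.array([120.0, 100.0, 90.0])
OBSERVED = simulate_heads(TRUE_C)

# Assumed search range for the Hazen-Williams C factor of every pipe.
BOUNDS = [(60.0, 150.0)] * 3


def objective(c):
    """Sum of squared differences between simulated and observed heads."""
    return float(np.sum((simulate_heads(c) - OBSERVED) ** 2))


def calibrate_slsqp(x0=(100.0, 100.0, 100.0)):
    """Gradient-based local search started from an initial roughness guess."""
    return minimize(objective, x0, method="SLSQP", bounds=BOUNDS)


def calibrate_de(seed=0):
    """Population-based global search over the same bounds."""
    return differential_evolution(objective, BOUNDS, seed=seed)


if __name__ == "__main__":
    for name, result in [("SLSQP", calibrate_slsqp()), ("DE", calibrate_de())]:
        print(name, np.round(result.x, 2), result.fun)
```

In this sketch each pipe is its own decision variable. Grouping pipes so that several share one roughness value would give a coarser formulation of the same problem, which is the kind of discretization choice the study compares.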