The need for accurate and computationally efficient representations of water in atomistic simulations that span biologically relevant timescales has motivated the development of coarse-grained (CG) models. Despite numerous advances, CG water models rely mostly on assumptions specified a priori. How these assumptions affect model accuracy, efficiency, and in particular transferability has not been systematically investigated. Here we propose a data-driven comparison and selection of CG water models through a hierarchical Bayesian framework. We examine CG water models that differ in their level of coarse-graining, structure, and number of interaction sites. We find that the importance of electrostatic interactions for the physical system under consideration is a dominant criterion for model selection. Multi-site models are favored unless the role of water in electrostatic screening is irrelevant, in which case the single-site model is preferred because of its computational savings. The charge distribution is found to play an important role in the accuracy of the multi-site models, while flexibility of the bonds and angles may improve the models only slightly. Furthermore, we find significant variations in the computational cost of these models. We present a data-informed rationale for the selection of CG water models and provide guidance for future water model designs.