Traditionally, groundwater and surface water flow models have been calibrated against two observation types: hydraulic heads and surface water discharge. It has repeatedly been demonstrated, however, that these classical observations do not contain sufficient information to calibrate flow models. To reduce the predictive uncertainty of flow models, the consideration of other observation types constitutes a promising way forward. Despite the ever-increasing availability of other observation types, however, they are still unconventional when it comes to flow model calibration. By reviewing studies that included nonclassical observations in flow model calibration, we identified the benefits and challenges associated with their integration and analyzed their information content. While explicit simulation of mass transport processes in flow models poses challenges, even simplified approaches to integrate tracer concentrations yield significantly better calibration results than using only classical observations. For a majority of calibrated flow models, observations of tracer concentrations and of exchange fluxes were beneficial. Temperature observations improved the simulation of heat transport but often worsened all other model outcomes. Only when temperature observations were made within 2 m of the surface water-groundwater interface did they have the potential to also improve flow and mass transport simulations. Surprisingly, many models were calibrated manually rather than with the widely available, mathematically robust, and automated tools. There is a clear need for more systematic implementation of unconventional observations and automated flow model calibration, as well as for more systematic quantification of the information content of unconventional observations.
Plain Language Summary

Traditionally, groundwater and surface water flow models, which are critical for water resources assessment, have been calibrated against only two classical observation types: groundwater levels and surface water discharge. In the past, it has repeatedly been demonstrated that these classical observations do not contain sufficient information to calibrate the parameters required for the simulation of groundwater and surface water flow systems. Owing to the rapid development of measurement techniques throughout the last three decades, however, many other observations of hydrological systems have become widely available. Despite this, observation types other than the classical ones are still unconventional when it comes to flow model calibration. The overall goal of this review is to identify optimal observation types and procedures for flow model calibration and hydrological predictions. We found that observations of tracer concentrations and exchange fluxes are beneficial for most flow models. Temperatures improve the simulation of heat transport but often worsen other flow model outcomes, unless temperatures are measured within 2 m of the surface water-groundwater interface. We identified a need for...