This paper describes benchmark testing of six two-dimensional (2D) hydraulic models (DIVAST, DIVASTTVD, TUFLOW, JFLOW, TRENT and LISFLOOD-FP) in terms of their ability to simulate surface flows in a densely urbanised area. The models are applied to a 1·0 km × 0·4 km urban catchment within the city of Glasgow, Scotland, UK, and are used to simulate a flood event that occurred at this site on 30 July 2002. An identical numerical grid describing the underlying topography is constructed for each model, using a combination of airborne laser altimetry (LiDAR) fused with digital map data, and used to run a benchmark simulation. Two numerical experiments were then conducted to test the response of each model to topographic error and uncertainty over friction parameterisation. While all the models tested produce plausible results, subtle differences between particular groups of codes give considerable insight into both the practice and science of urban hydraulic modelling. In particular, the results show that the terrain data available from modern LiDAR systems are sufficiently accurate and resolved for simulating urban flows, but such data need to be fused with digital map data of building topology and land use to gain maximum benefit from the information contained therein. When such terrain data are available, uncertainty in friction parameters becomes a more dominant factor than topographic error for typical problems. The simulations also show that flows in urban environments are characterised by numerous transitions to supercritical flow and numerical shocks. However, the effects of these are localised and they do not appear to affect overall wave propagation. In contrast, inertia terms are shown to be important in this particular case, but the specific characteristics of the test site may mean that this does not hold more generally.