Abstract. Benthic oxygen isotope records are commonly used as a proxy for global mean
surface temperatures during the Late Cretaceous and Cenozoic, and the
resulting estimates have been extensively used in characterizing major
trends and transitions in the climate system and for analysing past climate
sensitivity. However, some fundamental assumptions governing this proxy have
rarely been tested. Two key assumptions are that (a) benthic foraminiferal
temperatures are geographically well mixed and are linked to surface
high-latitude temperatures, and (b) surface high-latitude temperatures are well
correlated with global mean temperatures. To investigate the robustness of
these assumptions through geological time, we performed a series of 109
climate model simulations using a unique set of paleogeographical
reconstructions covering the entire Phanerozoic at the stage level. The
simulations have been run for at least 5000 model years to ensure that the
deep ocean is in dynamic equilibrium. We find that the correlation between
deep ocean temperatures and global mean surface temperatures is good for the
Cenozoic, and thus the proxy data are reliable indicators for this time
period, albeit with a standard error of 2 K. This uncertainty has not
normally been assessed and needs to be combined with other sources of
uncertainty when, for instance, estimating climate sensitivity based on
δ18O measurements from benthic foraminifera. The
correlation between deep and global mean surface temperature becomes weaker
for pre-Cenozoic time periods (when the paleogeography is significantly
different from the present day). The reasons for the weaker correlation
include variability in the source region of the deep water (varying
hemispheres but also varying latitudes of sinking), the depth of ocean
overturning (some extreme warm climates have relatively shallow and sluggish
circulations weakening the link between the surface and deep ocean), and the
extent of polar amplification (e.g. ice-albedo feedbacks). Deep ocean
sediments prior to the Cretaceous are rare, so extending the benthic foraminifera
proxy into deeper time is problematic, but the model results
presented here suggest that deep ocean temperatures from such time
periods would probably be an unreliable indicator of global mean surface conditions.