Earth system modelling relies on contributions from groups who develop models and from those involved in devising, executing, and exploiting numerical experiments. Often these people work in different institutions, and they may communicate primarily via published information (whether journal papers, technical notes, or websites). The complexity of the models, experiments, and methodologies, along with the diversity (and sometimes inexact nature) of information sources, can easily lead to misinterpretation of what was actually intended or done. In this paper we introduce a taxonomy of terms for more clearly defining numerical experiments, put it in the context of previous work on experimental ontologies, and describe how we have used it to document the CMIP6 experiments. We describe how this process involved iteration with a range of CMIP6 stakeholders to rationalise multiple sources of information and add clarity to experimental definitions. We demonstrate how this process has added value to CMIP6 itself by (a) helping those devising experiments to be clear about their goals and expected methodology, (b) making it easier for those executing experiments to know what was intended, (c) exposing inter-relationships between experiments, and (d) making it easier for third parties (data users) to understand the CMIP6 experiments. We conclude with some lessons learned and how these may be applied to any modelling campaign as well as to future CMIP phases.
Introduction
Climate modelling involves the use of models to carry out simulations of the real world, usually as part of an experiment aimed at understanding processes, testing hypotheses, or projecting some future climate system behaviour. Such numerical experiments can be organized into "Model Intercomparison Projects" (MIPs) in which participants execute common experiments and share results. Perhaps the best known of these are the CMIP series of Coupled Model Intercomparison Projects, of which the latest is CMIP6.
The design, documentation, and accompanying protocols have all evolved over time, reflecting both an increasing scope and more widespread interest, and two important new constituencies: (1) those who have organised "Diagnostic MIPs", which do not require new experiments, but rather request specific output from existing planned experiments to address specific interests; and (2) an even wider group of downstream users who use the CMIP data opportunistically, having little or no direct contact with the modelling groups.