The ILAMB version 1 (v1) and ILAMB version 2 (v2) benchmarking systems compare model results with best-available observational data products, focusing on atmospheric CO2, surface fluxes, hydrology, soil carbon and nutrient biogeochemistry, ecosystem processes and states, and vegetation dynamics. ILAMBv2 is expected to become an integral part of the workflow for model frameworks, including the Accelerated Climate Modeling for Energy (ACME) model and the Community Earth System Model (CESM). Moreover, ILAMBv2 will contribute model analysis and evaluation capabilities to phase 6 of the Coupled Model Intercomparison Project (CMIP6) and future model and model-data intercomparison projects.
Benchmarking Challenges and Priorities
A variety of statistical approaches have been adopted to evaluate model accuracy through comparison with observations, including calculations of bias, root-mean-square error (RMSE), phase, amplitude, spatial distribution, Taylor diagrams and scores, functional relationship metrics, and perturbation and sensitivity tests. While many of these statistical measures are not independent, each provides slightly different information about contemporary model performance with respect to observational data and about implications for future projections from ESMs.
However, developing metrics that make appropriate use of observational data remains a scientific challenge because of the spatial and temporal mismatch between models and measurements, poorly characterized uncertainties in observationally constrained data products, biases in reanalysis and forcing data, model simplifications, and structural and parametric uncertainties. A variety of benchmarking challenges and opportunities emerged from workshop breakout group meeting reports.
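As a minimal sketch of how a few of the statistical measures named above relate to one another, the Python below computes bias, RMSE, and the three statistics summarized by a Taylor diagram for one pair of series. The function names and sample values are hypothetical and purely illustrative; this is not ILAMB's implementation, and it assumes equally weighted, gap-free, collocated samples with no observational uncertainty.

```python
import math


def bias(model, obs):
    """Mean difference (model minus observation) over paired samples."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model, obs):
    """Root-mean-square error over paired samples."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def taylor_stats(model, obs):
    """Return (correlation, std-dev ratio, centered RMSE).

    These are the quantities a Taylor diagram displays; population
    (divide-by-n) statistics are used throughout.
    """
    n = len(obs)
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    m_anom = [m - m_mean for m in model]
    o_anom = [o - o_mean for o in obs]
    m_std = math.sqrt(sum(a * a for a in m_anom) / n)
    o_std = math.sqrt(sum(a * a for a in o_anom) / n)
    corr = sum(a * b for a, b in zip(m_anom, o_anom)) / (n * m_std * o_std)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, o_anom)) / n)
    return corr, m_std / o_std, crmse


# Hypothetical sample values (arbitrary units), for illustration only.
obs_values = [1.0, 2.0, 3.0, 4.0]
model_values = [1.5, 2.5, 2.5, 5.5]

b = bias(model_values, obs_values)
e = rmse(model_values, obs_values)
corr, std_ratio, crmse = taylor_stats(model_values, obs_values)

print(f"bias={b:.3f} rmse={e:.3f} corr={corr:.3f} "
      f"std_ratio={std_ratio:.3f} crmse={crmse:.3f}")
```

For any paired series, the squared RMSE equals the squared bias plus the squared centered RMSE. This is one concrete sense in which the measures are not independent, even though each highlights a different aspect of model-observation disagreement.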
Common themes included the following:
› Need for collocated measurements, particularly around a core set of AmeriFlux and FLUXNET sites with a sustained record of observations for repeated model testing
› Lack of quantified uncertainty information for observational data
› Utility of functional response metrics and variable-to-variable comparisons
› Value of metrics for future projections based on emergent constraints
› Unrealized opportunities for global observational data sets based on satellite remote sensing synthesized with ancillary databases, using new algorithms
› Importance of applying statistical and machine learning methods to upscaling sparse measurements from sites to regions to the globe
› Need for process-level benchmarks and metrics for extreme events
› Opportunities for collaboration with earth system model developers (e.g., ACME, CESM, and others)
› Opportunities for collaboration with important field and laboratory experiments and monitoring activities, including AmeriFlux and FLUXNET, Integrated Carbon Observation System (ICOS), Next Generation Ecosystem Experiments (NGEE) Arctic, Arctic-Boreal Vulnerability Experiment (ABoVE), Spruce and Peatland Responses Under Climatic and Environmental Change (SPRUCE) project, Critical Zone Observatories (CZOs), Long-Te...