Abstract. Tropospheric ozone is a toxic greenhouse gas with a highly variable spatial distribution which is challenging to map on a global scale.
Here, we present a data-driven ozone-mapping workflow generating a transparent and reliable product.
We map the global distribution of tropospheric ozone from sparse, irregularly placed measurement stations to a high-resolution regular grid using machine learning methods.
The produced map contains the average tropospheric ozone concentration of the years 2010–2014 with a resolution of 0.1∘ × 0.1∘.
The machine learning model is trained on AQ-Bench (“air quality benchmark dataset”), a pre-compiled benchmark dataset consisting of multi-year ground-based ozone measurements combined with an abundance of high-resolution geospatial data. Going beyond standard mapping methods, this work focuses on two key aspects to increase the integrity of the produced map.
Using explainable machine learning methods, we ensure that the trained machine learning model is consistent with commonly accepted knowledge about tropospheric ozone.
To assess the impact of data and model uncertainties on our ozone map, we show that the machine learning model is robust against typical fluctuations in ozone values and geospatial data.
By inspecting the input features, we ensure that the model is only applied in regions where it is reliable. We provide a rationale for the tools we use to conduct a thorough global analysis.
The methods presented here can thus be easily transferred to other mapping applications to ensure the transparency and reliability of the maps produced.