The evaluation of explanation methods is a research topic that has not yet been explored deeply, however, since explainability is supposed to strengthen trust in artificial intelligence, it is necessary to systematically review and compare explanation methods in order to confirm their correctness. Until now, no tool exists that exhaustively and speedily allows researchers to quantitatively evaluate explanations of neural network predictions. To increase transparency and reproducibility in the field, we therefore built Quantus -a comprehensive, open-source toolkit in Python that includes a growing, well-organised collection of evaluation metrics and tutorials for evaluating explainable methods. The toolkit has been thoroughly tested and is available under open source license on PyPi (or on https://github.com/understandable-machine-intelligence-lab/quantus/).
<p>Explainable artificial intelligence (XAI) methods serve as a support for researchers to shed light onto the reasons behind the predictions made by deep neural networks (DNNs). XAI methods have already been successfully applied to climate science, revealing underlying physical mechanisms inherent in the studied data. However, the evaluation and validation of XAI performance is challenging as explanation methods often lack ground truth. As the number of XAI methods is growing, a comprehensive evaluation is necessary to enable well-founded XAI application in climate science.</p> <p>In this work we introduce explanation evaluation in the context of climate research. We apply XAI evaluation to compare multiple explanation methods for a multi-layer percepton (MLP) and a convolutional neural network (CNN). Both MLP and CNN assign temperature maps to classes based on their decade. We assess the respective explanation methods using evaluation metrics measuring robustness, faithfulness, randomization, complexity and localization. Based on the results of a random baseline test we establish an explanation evaluation guideline for the climate community. We use this guideline to rank the performance in each property of similar sets of explanation methods for the MLP and CNN. Independent of the network type, we find that Integrated Gradients, Layer-wise relevance propagation and InputGradients exhibit a higher robustness, faithfulness and complexity compared to purely Gradient-based methods, while sacrificing reactivity to network parameters, i.e. low randomisation scores. The contrary holds for Gradient, SmoothGrad, NoiseGrad and FusionGrad. Another key observation is that explanations using input perturbations, such as SmoothGrad and Integrated Gradients, do not improve robustness and faithfulness, in contrast to theoretical claims. Our experiments highlight that XAI evaluation can be applied to different network tasks and offers more detailed information about different properties of explanation method than previous research. We demonstrate that using XAI evaluation helps to tackle the challenge of choosing an explanation method.</p>
<p>In climate change research we are dealing with a chaotic system, usually leading to huge computational efforts in order to make faithful predictions. Deep neural networks (DNNs) offer promising new approaches due to their computational efficiency and universal solution properties. However, despite the increase in successful application cases with DNNs, the black-box nature of such purely data-driven approaches limits their trustworthiness and therefore the useability of deep learning in the context of climate science.</p><p>The field of explainable artificial intelligence (XAI) has been established to enable a deeper understanding of the complex, highly-nonlinear methods and their predictions. By shedding light onto the reasons behind the predictions made by DNNs, XAI methods can serve as a support for researchers to reveal the underlying physical mechanisms and properties inherent in the studied data. Some XAI methods have already been successfully applied to climate science, however, no detailed comparison of their performances is available. As the number of XAI methods on the one hand, and DNN applications on the other hand are growing, a comprehensive evaluation is necessary in order to understand the different XAI methods in the climate context.</p><p>In this work we provide an overview of different available XAI methods and their potential applications for climate science. Based on a previously published climate change prediction task, we compare several explanation approaches, including model-aware (e.g. Saliency, IntGrad, LRP) and model-agnostic methods (e.g. SHAP). We analyse their ability to verify the physical soundness of the DNN predictions as well as their ability to uncover new insights into the underlying climate phenomena. Another important aspect we address in our work is the possibility to assess the underlying uncertainties of DNN predictions using XAI methods. This is especially crucial in climate science applications where uncertainty due to natural variability is usually large. To this end, we investigate the potential of two recently introduced XAI methods &#8212;UAI+ and NoiseGrad, which have been designed to include uncertainty information of the predictions into the explanations. We demonstrate that those XAI methods enable more stable explanations with respect to model noise and can further deal with uncertainties of network information. We argue that these methods are therefore particularly suitable for climate science application cases.</p>
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.