Machine learning (ML) is increasingly considered the solution to
environmental problems where only limited or no physico-chemical process
understanding is available. But when there is a need to provide support
for high-stakes decisions, where the ability to explain possible
solutions is key to their acceptability and legitimacy, ML can fall
short. Here, we develop a method, rooted in formal sensitivity analysis
(SA), that can detect the primary controls on the outputs of ML models.
Unlike many common methods for explainable artificial intelligence
(XAI), this method can account for complex multi-variate distributional
properties of the input-output data, commonly observed in environmental
systems. We apply this approach to a suite of ML models developed to
predict various water quality variables in a
pilot-scale experimental pit lake.
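To make the idea concrete, the sketch below shows a generic variance-based (Sobol) sensitivity analysis applied to the predictions of a trained ML model. It is only an illustration of the general SA workflow, not the paper's method (which additionally accounts for dependent, multi-variate inputs); the input names, ranges, data, and model are hypothetical, and the example assumes the SALib and scikit-learn packages.

```python
# Illustrative sketch only: variance-based (Sobol) SA of a trained ML model.
# NOT the paper's method; input names, ranges, data, and model are hypothetical.
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical explanatory inputs (e.g., pH, temperature, dissolved oxygen).
problem = {
    "num_vars": 3,
    "names": ["pH", "temp_C", "DO_mg_L"],
    "bounds": [[4.0, 9.0], [5.0, 30.0], [0.0, 12.0]],
}

# Stand-in training data and model (replace with the real data and model).
X_train = rng.uniform([4, 5, 0], [9, 30, 12], size=(500, 3))
y_train = 2.0 * X_train[:, 0] - 0.1 * X_train[:, 1] + rng.normal(0, 0.5, 500)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Sample the input space, push the samples through the model,
# and estimate first-order and total-order Sobol indices of the predictions.
X_sa = saltelli.sample(problem, 1024)
y_sa = model.predict(X_sa)
Si = sobol.analyze(problem, y_sa)
for name, s1, st in zip(problem["names"], Si["S1"], Si["ST"]):
    print(f"{name}: first-order={s1:.2f}, total-order={st:.2f}")
```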
A critical finding is that subtle alterations in the design of an ML
model (such as variations in the random seed used for initialization,
the functional class, the hyperparameters, or the data splitting) can
lead to entirely different interpretations of how the outputs depend on
the explanatory inputs. Further, models from different ML families
(tree-based, connectionist, or kernel-based) appear to focus on
different aspects of the information contained in the data, despite
displaying similar levels of predictive power. Overall, this underscores
the importance of employing ensembles of ML models when explanatory
power is sought. Not doing so may compromise the ability of the analysis
to deliver robust and reliable predictions, especially when generalizing
to conditions beyond the training data.
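As a hedged illustration of this sensitivity to design choices, the snippet below (hypothetical data, assuming scikit-learn) retrains the same small neural network with different random seeds and reports permutation feature importances: predictive skill stays similar while the apparent ranking of inputs can shift, especially when inputs are correlated.

```python
# Illustrative sketch only (hypothetical data): how changing just the random
# seed of a neural network can shift which inputs the fitted model appears
# to rely on, even when predictive skill is similar.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(600, 4))
# Two correlated inputs carrying overlapping information about the target.
X[:, 1] = 0.9 * X[:, 0] + 0.1 * rng.normal(size=600)
y = X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.3, 600)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for seed in (0, 1, 2):
    mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                       random_state=seed).fit(X_tr, y_tr)
    imp = permutation_importance(mlp, X_te, y_te, n_repeats=20,
                                 random_state=0)
    print(f"seed={seed}  R2={mlp.score(X_te, y_te):.2f}  "
          f"importances={np.round(imp.importances_mean, 2)}")
```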