We are interested in predicting the onset of a disease D from several risk factors. Two classes of techniques are available for this purpose, and they differ markedly in how their results can be interpreted, which is the focus of this paper:
1. Classical Statistics (for example, Generalized Linear Models (GLM)).
2. Neural Networks (NN) (or, more generally, Artificial Intelligence (AI)).
Both methods predict rather well, with a preference for neural networks when the number of potential predictors is large. The advantage of classical statistics is cognitive: the role of each factor is generally summarized by the value of a coefficient, which is strongly positive for a harmful factor, close to 0 for an irrelevant factor, and strongly negative for a beneficial one. By contrast, the model underlying a neural network repeatedly mixes all factors together, so that the effect of each individual factor is rather difficult to summarize. However, several algorithms give some insight into the respective impact of each risk factor. In particular, we can distort the data set by permuting the values of the risk factors, one factor at a time. If the prediction performance of the neural network remains stable, the corresponding factor is irrelevant. Conversely, if the quality of the prediction decreases, the impact of the corresponding risk factor may be considered proportional to this decrease.
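The permutation procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the simulated data, the three factors with effects +2, 0 and -2, and the function `black_box_predict`, a hypothetical stand-in for a trained neural network (any predictor returning 0 or 1 could be substituted). Accuracy is used as the performance measure, which is one possible choice among others.

```python
import math
import random


def make_data(n, rng):
    """Simulate n subjects with three risk factors and a binary disease outcome."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # Assumed true effects: harmful (+2), irrelevant (0), beneficial (-2).
        p = 1.0 / (1.0 + math.exp(-(2.0 * x[0] + 0.0 * x[1] - 2.0 * x[2])))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def black_box_predict(x):
    """Hypothetical stand-in for a trained network: returns a 0/1 prediction."""
    return 1 if 2.0 * x[0] - 2.0 * x[2] > 0.0 else 0


def accuracy(predict, X, y):
    """Fraction of subjects whose outcome is predicted correctly."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, rng, n_repeats=10):
    """Mean drop in accuracy when each factor is permuted in turn."""
    baseline = accuracy(predict, X, y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only; all other factors stay untouched.
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            total += baseline - accuracy(predict, X_perm, y)
        drops.append(total / n_repeats)
    return baseline, drops


rng = random.Random(0)
X, y = make_data(2000, rng)
baseline, drops = permutation_importance(black_box_predict, X, y, rng)
print("baseline accuracy:", round(baseline, 3))
for name, d in zip(["harmful", "irrelevant", "beneficial"], drops):
    print(name, "factor, mean accuracy drop:", round(d, 3))
```

In this sketch the irrelevant factor yields a drop of exactly zero, because the stand-in predictor ignores it, while the harmful and the beneficial factors both yield a positive drop. The drop measures the magnitude of a factor's impact but, unlike a GLM coefficient, not its direction.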