Predictive business process monitoring methods exploit historical process execution logs to provide predictions about running instances of a process. These predictions enable process workers and managers to preempt performance issues or compliance violations. A number of approaches have been proposed to predict quantitative process performance indicators for running instances of a process, including remaining cycle time, cost, or probability of deadline violation. However, these approaches adopt a black-box approach, insofar as they predict a single scalar value without decomposing this prediction into more elementary components. In this paper, we propose a white-box approach to predict performance indicators of running process instances. The key idea is to first predict the performance indicator at the level of activities and then to aggregate these predictions at the level of a process instance by means of flow analysis techniques. The paper develops this idea in the context of predicting the remaining cycle time of ongoing process instances. The proposed approach has been evaluated on real-life event logs and compared against several baselines.

KEYWORDS
explainable artificial intelligence, flow analysis, predictive process monitoring, process mining, transparent models
INTRODUCTION
Predictive business process monitoring techniques seek to predict the future state or properties of ongoing executions of a process based on models extracted from historical event logs. A wide range of predictive business process monitoring techniques have been proposed to predict, for example, compliance violations [1,2], the next activity or the remaining sequence of activities of a process instance [3,4], or quantitative process performance indicators, such as the remaining cycle time of a process instance [5-7]. These predictions can be used to alert process workers to problematic process instances or to support resource allocation decisions, e.g., to allocate additional resources to instances that are at risk of a deadline violation.

This article addresses the problem of predicting quantitative process performance indicators, with a specific focus on predicting the remaining cycle time of ongoing process instances. Existing approaches to this problem adopt a "black-box" approach by building stochastic models or regression models which, given a process instance, predict the remaining execution time as a single scalar value, without seeking to explain this prediction in terms of more elementary components. Yet, quantitative performance indicators such as cost or time are aggregations of corresponding performance indicators of the activities composing the process. For example, the cycle time of a process instance with sequentially performed activities is the sum of the cycle times of the activities performed in that process instance. In this respect, existing known as flow analysis [8]. The idea of flow analysis is to estimate a quantitative performance indicator at the level of a process by aggregating the estimated values of this perfo...