Predictive process monitoring techniques aim to forecast the outcomes of running process instances. These techniques rely on predictive models built from past observed behavior, i.e., in an offline setting. However, process behavior usually changes over time, so such models are at risk of becoming obsolete. For this reason, systems that build predictive models in an online setting have recently gained attention. Nevertheless, the scalability of such online settings remains an open issue in a context where the amount of available data is growing rapidly. To address this problem, this paper defines a framework for event sequence prediction capable of taking advantage of modern distributed processing platforms. An implementation of this framework based on Apache Flink is presented and tested on two different case studies to demonstrate its validity and its capacity to scale.

INDEX TERMS Data mining, distributed processing, monitoring, predictive models.