Abstract. As organizations reach higher levels of Business Process Management maturity, they tend to accumulate large collections of process models. These repositories may contain thousands of activities and be managed by different stakeholders with varying skills and responsibilities. However, while of great value, these repositories incur high management costs. Thus, it becomes essential to keep track of the various model versions, as they may mutually overlap, supersede one another, and evolve over time. We propose an innovative versioning model, and associated storage structure, specifically designed to maximize sharing across process models and process model versions, reduce conflicts in concurrent edits, and automatically handle controlled change propagation. The focal point of this technique is to version single process model fragments, rather than entire process models. Indeed, empirical evidence shows that real-life process model repositories have numerous duplicate fragments. Experiments on two industrial datasets confirm the usefulness of our technique.
Abstract. Evidence exists that repositories of business process models used in industrial practice contain significant amounts of duplication. This duplication may arise because the repository describes variants of the same processes and/or because of copy/pasting activity throughout the lifetime of the repository. Previous work has put forward techniques for identifying duplicate fragments (clones) that can be refactored into shared subprocesses. However, these techniques are limited to finding exact clones. This paper analyzes the problem of approximate clone detection and puts forward two techniques for detecting clusters of approximate clones. Experiments show that the proposed techniques are able to accurately retrieve clusters of approximate clones that originate from copy/pasting followed by independent modifications to the copied fragments.
Automated process discovery techniques aim at extracting process models from information system logs. Existing techniques in this space are effective when applied to relatively small or regular logs, but generate spaghetti-like and sometimes inaccurate models when confronted with logs with high variability. In previous work, trace clustering has been applied in an attempt to reduce the size and complexity of automatically discovered process models. The idea is to split the log into clusters and to discover one model per cluster. This leads to a collection of process models, each one representing a variant of the business process, as opposed to an all-encompassing model. Still, models produced in this way may exhibit unacceptably high complexity and low fitness. In this setting, this paper presents a two-way divide-and-conquer process discovery technique, wherein the discovered process models are split on the one hand by variants and on the other hand hierarchically using subprocess extraction. Splitting is performed in a controlled manner in order to achieve user-defined complexity or fitness thresholds. Experiments on real-life logs show that the technique produces collections of models substantially smaller than those extracted by applying existing trace clustering techniques, while allowing the user to control the fitness of the resulting models.