Process mining (PM) exploits event logs to obtain meaningful information about the processes that produced them. As the number of applications developed on cloud infrastructures increases, it becomes important to study and discover their underlying processes. However, many current PM technologies struggle with the large and complex event logs of cloud applications, especially when those logs have little structure (e.g., clickstreams). Using Design Science Research, this paper introduces a new method, called cloud pattern API-process mining (CPA-PM), which enables the discovery and analysis of cloud-based application processes using PM in a way that addresses many of these challenges. CPA-PM exploits a new application programming interface (API), with an R implementation, for creating repeatable scripts that preprocess event logs collected from such applications. Applying CPA-PM to a case with real and evolving event logs related to the trial process of a software-as-a-service cloud application led to useful analyses and insights, with reusable scripts. CPA-PM helps produce executable scripts for filtering event logs from clickstream and cloud-based applications; these scripts can be used in pipelines while minimizing the need for error-prone and time-consuming manual filtering.