2022
DOI: 10.1145/3511707

Quality-Informed Process Mining: A Case for Standardised Data Quality Annotations

Abstract: Real-life event logs, reflecting the actual executions of complex business processes, are faced with numerous data quality issues. Extensive data sanity checks and pre-processing are usually needed before historical data can be used as input to obtain reliable data-driven insights. However, most of the existing algorithms in process mining, a field focusing on data-driven process analysis, do not take any data quality issues or the potential effects of data pre-processing into account explicitly. This can resu…

Citations: cited by 9 publications (2 citation statements)
References: 54 publications
“…This could lead to misleading or inaccurate conclusions about the process under investigation. In [30], the authors proposed a range of quality annotations at event, trace and log levels to keep track of the data quality issues founded in an event log and also to record the extent of repairs are made to the event log as a result. Such metadata about data quality can assist in undertaking quality-informed process mining.…”
Section: Quality-informed Process Mining (citation type: mentioning)
confidence: 99%
“…Existing techniques for data-driven process analysis, researched and practiced under the umbrella of process mining, are limited in their support for uncertain data. Specifically, event logs that are only stochastically known are often cleaned [4], [5] or transformed into less informative deterministically known logs by ad-hoc approaches for noise reduction, e.g., by maximum likelihood estimation or thresholding. As a consequence, data uncertainty is hidden from the actual analysis, which potentially introduces a bias and prevents any assessment of the confidence that should be put into the results and exposed to downstream applications.…”
Section: Introduction (citation type: mentioning)
confidence: 99%