Document Type: Review Article
Authors
1
Department of Information Technology Engineering, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran
2
Department of Systems and Productivity Management, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran
3
Department of Industrial Engineering, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran
4
Department of Socio-economic Systems, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran
Abstract
This paper aims to identify and categorize the most important data quality problems in process mining and to determine the approaches proposed to address them. The method is a systematic review, conducted to analyze all valid evidence in answer to the research questions. We reviewed 102 academic studies published between 2007 and 2021, including conference papers, journal articles, and a number of dissertations. The results show that 20 data quality issues are discussed in the literature. We categorized these issues into five levels: trace, event, case, activity, and timestamp. We also identified four fundamental approaches that studies use to evaluate and address data quality issues in process mining: 1) data quality frameworks, 2) preprocessing, 3) anomaly detection, and 4) repair. Despite these efforts to improve the quality of process mining input data, we propose that new methods be explored and developed for the highly complex data found in real business processes.