Data Quality in Process Mining: A Systematic Review

Document Type: Review Article

Authors

1 Ph.D. Student, Department of Information Technology Engineering, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran

2 Professor, Department of Systems and Productivity Management, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran

3 Associate Professor, Department of Industrial Engineering, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran

4 Associate Professor, Department of Socio-economic Systems, Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran

Abstract

Purpose: Process mining connects data mining and machine learning with business process management. A business process is a series of independent and interdependent activities that use one or more resources (such as time, employees, and money) to transform inputs (data, materials, etc.) into the required outputs. Process analysis techniques make it possible to examine the actual behavior of an organization, including the performance of individuals, departments, and resources. The results of such an analysis, which typically include the organization's business process models, can be compared with the organization's documents and requirements, so that processes can be compared, reviewed, monitored, and enhanced. Process mining methods operate on event logs stored in information systems, and applying them to low-quality input data will not yield accurate conclusions about an organization's business processes. In recent years, researchers have therefore focused on evaluating and enhancing the quality of the input data used by process mining techniques. The objective of this study is to identify and categorize the most significant data quality issues and to identify the approaches proposed to address them in process mining.
Methods: This research employs a systematic review to analyze all valid evidence relevant to the research questions. It investigates 102 academic studies published between 2007 and 2021, including conference papers, journal articles, and theses. A three-part research methodology was employed. In the first part, the research definition, the research field was defined, followed by the research objectives and questions, and finally the research scope. In the second part, the search method and the inclusion criteria for the studies found in the search of scientific resources were defined, and the identified studies were then evaluated in terms of their citations and classified. In the third part, the evaluation, the selected studies were given a final review, and the findings and conclusions were drawn from that investigation. Important data and evidence were extracted from the collated studies and used to build the necessary tables and graphs.
Findings: The findings show that researchers have paid increasing attention to data quality challenges in process mining in recent years. The greatest number of studies were published in 2019 and 2020. The majority of articles were published in three scientific databases, namely Springer, IEEE, and Elsevier. Of the studies examined, 51% were presented at prestigious conferences, 36% were published in prestigious scientific journals, and the remaining 13% were dissertations and university reports. The selected articles show that 20 data quality issues that can arise in the input data have been investigated in the literature. These issues have been categorized into five levels: trace, event, case, activity, and timestamp. Four foundational approaches have been used to evaluate and resolve data quality challenges in process mining: (1) data quality frameworks, (2) preprocessing, (3) anomaly detection, and (4) repair. Our findings indicate that preprocessing techniques that seek to remove chaotic and infrequent behavior from the event log have received more attention than other techniques. The results also show that, in recent years, anomaly detection and the reconstruction of missing events have become popular research topics within process mining. The studies on data quality in process mining offer an abundance of approaches and methods for addressing data quality challenges. The investigation also revealed that the use of colored Petri nets as a mathematical method has been considered in all selected studies.
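As an illustration of the preprocessing family mentioned in the findings, the following is a minimal sketch of one simple way to remove infrequent behavior: dropping cases whose activity sequence (variant) is rare. It is not a method taken from any of the reviewed studies; the function name, the tuple layout of the events, and the `min_share` threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def filter_infrequent_variants(events, min_share=0.2):
    """Keep only cases whose variant covers at least `min_share` of all cases.

    `events` is a list of (case_id, activity, timestamp) tuples (an assumed
    layout); timestamps only need to be sortable.
    """
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        traces[case_id].append((timestamp, activity))

    # A variant is the ordered tuple of activities within one case.
    # Events with equal timestamps are ordered by activity name here,
    # which is an arbitrary tie-break chosen for the sketch.
    variants = {
        case_id: tuple(activity for _, activity in sorted(steps))
        for case_id, steps in traces.items()
    }
    counts = Counter(variants.values())
    total_cases = len(variants)

    kept_cases = {
        case_id
        for case_id, variant in variants.items()
        if counts[variant] / total_cases >= min_share
    }
    return [event for event in events if event[0] in kept_cases]


log = [
    ("c1", "register", 1), ("c1", "check", 2), ("c1", "pay", 3),
    ("c2", "register", 1), ("c2", "check", 2), ("c2", "pay", 3),
    ("c3", "register", 1), ("c3", "check", 2), ("c3", "pay", 3),
    ("c4", "pay", 1), ("c4", "register", 2),
]
# The variant of c4 occurs in 1 of 4 cases (0.25), below the 0.3 threshold.
filtered = filter_infrequent_variants(log, min_share=0.3)
```

A frequency threshold of this kind is deliberately crude: it cannot distinguish rare but valid behavior from noise, which is one reason the literature surveyed here proposes more refined filters.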
Conclusions: The data needed for process mining can be obtained from various sources. One major advantage of process mining is that it is not limited to a specific type of system. Data from any workflow-based system, such as ticketing, resource management, databases, data warehouses, and legacy systems, and even manually collected data, can be analyzed as long as its events can be distinguished by case ID, activity, and timestamp attributes. In real-world scenarios, however, most data is not collected for process mining purposes or is unsuitable for process mining analyses. Data that is recorded manually or scattered across isolated systems is especially prone to errors. Despite the efforts made to improve the quality of input data in process mining, efficient frameworks and methods are still needed to identify, evaluate, and address data quality challenges in real business processes, which are often characterized by high volume and complexity. The results of this research can offer a fresh perspective for researchers, data science specialists, and business analysts.
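The minimal requirement stated above (case ID, activity, and timestamp) can be checked before any analysis. The sketch below is illustrative only: the column names in `REQUIRED`, the issue labels, and the assumption that timestamps are ISO 8601 strings are choices made for this example, not part of the reviewed work.

```python
from collections import Counter
from datetime import datetime

# Hypothetical column names for the three minimal attributes.
REQUIRED = ("case_id", "activity", "timestamp")


def profile_event_log(rows):
    """Count basic quality issues in `rows`, a list of dicts of strings."""
    issues = Counter()
    seen = Counter()
    for row in rows:
        # A missing key, None, or an empty string counts as a missing value.
        if any(not row.get(name) for name in REQUIRED):
            issues["missing_value"] += 1
            continue
        try:
            datetime.fromisoformat(row["timestamp"])
        except ValueError:
            issues["invalid_timestamp"] += 1
            continue
        seen[(row["case_id"], row["timestamp"])] += 1
    # Events of one case that share a timestamp cannot be ordered reliably.
    issues["same_timestamp"] = sum(n for n in seen.values() if n > 1)
    return dict(issues)


rows = [
    {"case_id": "c1", "activity": "register", "timestamp": "2021-01-05T09:00:00"},
    {"case_id": "c1", "activity": "check", "timestamp": "2021-01-05T09:00:00"},
    {"case_id": "c1", "activity": "", "timestamp": "2021-01-05T10:00:00"},
    {"case_id": "c2", "activity": "register", "timestamp": "05/01/2021"},
]
report = profile_event_log(rows)
```

Here `report` counts one event with a missing value, one with an unparseable timestamp, and two events of case `c1` that share a timestamp. A real assessment would cover far more of the 20 issue types identified in the review; this only shows the shape of such a check.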
 

Alharbi, A., Bulpitt, A. & Johnson, O. (2017). Improving pattern detection in healthcare process mining using an interval-based event selection method. Business Process Management Forum: BPM Forum 2017, Barcelona, Spain, September 10-15. Proceedings 15.
Al-Hashedi, K.G. & Magalingam, P. (2021). Financial fraud detection applying data mining techniques: a comprehensive review from 2009 to 2019. Computer Science Review, 40: 100402. https://doi.org/10.1016/j.cosrev.2021.100402
Alizadeh, F. & Hadavinejad, M. (2019). Process Mining for Anti-Elitism in an Organization based on Interpretive Mapping Design of Grounded Theory. Organizational Resources Management Researchs, 9(1): 165-183. [in persian]
Andrews, R., Emamjome, F., ter Hofstede, A.H. & Reijers, H.A. (2020). An expert lens on data quality in process mining. 2020 2nd International Conference on Process Mining (ICPM).
Andrews, R., Suriadi, S., Ouyang, C. & Poppe, E. (2018). Towards event log querying for data quality: Let’s start with detecting log imperfections. On the Move to Meaningful Internet Systems. OTM 2018 Conferences: Confederated International Conferences: CoopIS, C&TC, and ODBASE 2018, Valletta, Malta, October 22-26. Proceedings, Part I.
Andrews, R., van Dun, C.G., Wynn, M.T., Kratsch, W., Röglinger, M. & ter Hofstede, A.H. (2020). Quality-informed semi-automated event log generation for process mining. Decision Support Systems, 132: 113265. https://doi.org/10.1016/j.dss.2020.113265
Andrews, R., Wynn, M.T., Vallmuur, K., Ter Hofstede, A.H., Bosley, E., Elcock, M. & Rashford, S. (2019). Leveraging data quality to better prepare for process mining: an approach illustrated through analysing road trauma pre-hospital retrieval and transport processes in Queensland. International journal of environmental research and public health, 16(7): 1138.    
https://doi.org/10.3390/ijerph16071138
Ayo, F.E., Folorunso, O. & Ibharalu, F.T. (2017). A probabilistic approach to event log completeness. Expert Systems with Applications, 80: 263-272.
https://doi.org/10.1016/j.eswa.2017.03.039
Baier, T. (2015). Matching events and activities: preprocessing event logs for process analysis. Doctoral dissertation. Universität Potsdam. Retrieved from:
https://publishup.uni-potsdam.de/frontdoor/index/index/docId/8454
Bayomie, D., Awad, A. & Ezat, E. (2016). Correlating unlabeled events from cyclic business processes execution. International Conference on Advanced Information Systems Engineering.
Bernard, G. & Andritsos, P. (2020). Truncated trace classifier. removal of incomplete traces from event logs. Enterprise, Business-Process and Information Systems Modeling: 21st International Conference, BPMDS 2020, 25th International Conference, EMMSAD 2020, Held at CAiSE 2020, Grenoble, France, June 8–9, Proceedings 21.
Bezerra, F. & Wainer, J. (2013). Algorithms for anomaly detection of traces in logs of process aware information systems. Information systems, 38(1): 33-44.   
https://doi.org/10.1016/j.is.2012.04.004
Bezerra, F., Wainer, J. & van der Aalst, W.M. (2009). Anomaly detection using process mining. Enterprise, Business-Process and Information Systems Modeling: 10th International Workshop, BPMDS 2009, and 14th International Conference, EMMSAD 2009, held at CAiSE 2009, Amsterdam, The Netherlands, June 8-9. Proceedings.
Böhmer, K. & Rinderle-Ma, S. (2016). Multi-perspective anomaly detection in business process execution events. On the Move to Meaningful Internet Systems: OTM 2016 Conferences: Confederated International Conferences: CoopIS, C &TC, and ODBASE 2016, Rhodes, Greece, October 24-28, Proceedings.
Bose, J.C. (2012). Process mining in the large: Preprocessing, Discovery and Diagnostics Eindhoven University of Technology. Eindhoven.
Bose, R.P.J.C., Mans, R.S. & van der Aalst, W.M. (2013). Wanna improve process mining results?: it’s high time we consider data quality issues seriously. BPM reports, 1302.
Burattin, A. (2015). Process mining techniques in business environments. Springer Cham. https://doi.org/10.1007/978-3-319-17482-2
Cappiello, C., Comuzzi, M., Plebani, P. & Fim, M. (2022). Assessing and improving measurability of process performance indicators based on quality of logs. Information systems, 103: 101874. https://doi.org/10.1016/j.is.2021.101874
Chapela-Campa, D., Mucientes, M. & Lama, M. (2019). Simplification of complex process models by abstracting infrequent behaviour. Service-Oriented Computing: 17th International Conference, ICSOC 2019, Toulouse, France, October 28–31, 2019, Proceedings 17.
Chen, L., Kang, S., Karimidorabati, S. & Haas, C. (2019). Improving the quality of event logs in the construction industry for process mining. ISARC. Proceedings of the International Symposium on Automation and Robotics in Construction.
Chen, Q., Lu, Y., Tam, C. & Poon, S. (2021). A Novel Approach to Detect Redundant Activity Labels for More Representative Event Logs. arXiv preprint arXiv: 2103.16061.
Cheng, H.-J. & Kumar, A. (2015). Process mining on noisy logs - Can log sanitization help to improve performance? Decision Support Systems, 79: 138-149.         
https://doi.org/10.1016/j.dss.2015.08.003
Commission, I.O.F.S.I.E. (2008). Software engineering-Software product Quality Requirements and Evaluation (SQuaRe) Data quality model. ISO/IEC, 25012: 1-13.
Conforti, R., La Rosa, M. & ter Hofstede, A.H. (2016). Filtering out infrequent behavior from business process event logs. IEEE Transactions on Knowledge and Data Engineering, 29(2): 300-314. https://doi.org/10.1109/TKDE.2016.2614680
Conforti, R., La Rosa, M., Ter Hofstede, A.H. & Augusto, A. (2020). Automatic repair of same-timestamp errors in business process event logs. Business Process Management: 18th International Conference, BPM 2020, Seville, Spain, September 13–18, 2020, Proceedings 18.
Conforti, R., Rosa, M.L. & Hofstede, A.H.M.T. (2018). Timestamp Repair for Business Process Event Logs. University of Melbourne: Melbourne, Australia.
De Leoni, M. & Felix, M. (2015). Road Traffic Fine Management Process. 4TU.ResearchData. Dataset. https://doi.org/10.4121/uuid:270fd440-1057-4fb9-89a9-b699b47990f5
Denisov,V., Fahland, D. & van der Aalst, W.M. (2020). Repairing event logs with missing events to support performance analysis of systems with shared resources. Application and Theory of Petri Nets and Concurrency: 41st International Conference, PETRI NETS 2020, Paris, France, June 24–25, 2020, Proceedings 41.
Díaz-Rodriguez, O.E. & Hernández, M.G.P. (2020). Quality event log to intention mining: a study case. In: 2020 International Conference on Computer Science, Engineering and Applications (ICCSEA).
Dixit, P.M., Suriadi, S., Andrews, R., Wynn, M.T., ter Hofstede, A.H., Buijs, J.C. & van der Aalst, W.M. (2018). Detection and interactive repair of event ordering imperfection in process logs. Advanced Information Systems Engineering: 30th International Conference, CAiSE 2018, Tallinn, Estonia, June 11-15, 2018, Proceedings 30.
Dumas, M., La Rosa, M., Mendling, J. & Reijers, H.A. (2018). Fundamentals of Business Process Management. (2 ed.). Springer Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-56509-4
Emamjome, F., Andrews, R., ter Hofstede, A. & Reijers, H. (2020). Alohomora: Unlocking data quality causes through event log context. Proceedings of the 28th European Conference on Information Systems (ECIS2020).
Erdogan, T.G. & Tarhan, A. (2018). Systematic mapping of process mining studies in healthcare. IEEE Access, 6: 24543-24567. https://doi.org/10.1109/ACCESS.2018.2831244
Fan, W. (2012). Data Quality: Theory and Practice. Web-Age Information Management 13th International Conference, WAIM 2012, Harbin, China.
Ferreira, D.R. & Gillblad, D. (2009). Discovering process models from unlabelled event logs. Business Process Management: 7th International Conference, BPM 2009, Ulm, Germany, September 8-10, 2009. Proceedings 7.
Fischer, D.A., Goel, K., Andrews, R., van Dun, C.G.J., Wynn, M.T. & Röglinger, M. (2020). Enhancing event log quality: Detecting and quantifying timestamp imperfections. Business Process Management: 18th International Conference, BPM 2020, Seville, Spain, September
13–18, 2020, Proceedings 18.
Fox, F., Aggarwal, V.R., Whelton, H. & Johnson, O. (2018). A data quality framework for process mining of electronic health record data. 2018 IEEE International Conference on Healthcare Informatics (ICHI).
Gao, Y., Song, S., Zhu, X., Wang, J., Lian, X. & Zou, L. (2018). Matching heterogeneous event data. IEEE Transactions on Knowledge and Data Engineering, 30(11): 2157-2170.
https://doi.org/10.1109/TKDE.2018.2815695
Ghionna, L., Greco, G., Guzzo, A. & Pontieri, L. (2008). Outlier detection techniques for process mining applications. Foundations of Intelligent Systems: 17th International Symposium, ISMIS 2008 Toronto, Canada, May 20-23, 2008 Proceedings 17.
Helal, I. & Awad, A. (2020). Correlating Unlabeled Events at Runtime. arXiv preprint arXiv, 2004.09971. https://doi.org/10.48550/arXiv.2004.09971
Horita, H., Kurihashi, Y. & Miyamori, N. (2020). Extraction of missing tendency using decision tree learning in business process event log. Data, 5(3): 82. https://doi.org/10.3390/data5030082
Hosseini, S., Mosleh, A. & Hosseini, M. (2018). Analyzing Organizational Processes by using Process Mining Technique (The Case of Academic Staff's Grade Process of Persian Gulf University). Industrial Management Perspective, 8(1): 113-135. [in persian]
Huang, Y., Wang, Y. & Huang, Y. (2018). Filtering Out Infrequent Events by Expectation from Business Process Event Logs. 2018 14th International Conference on Computational Intelligence and Security (CIS).
Huang, Y., Zhong, L. & Chen, Y. (2020). Filtering Infrequent Behavior in Business Process Discovery by Using the Minimum Expectation. International Journal of Cognitive Informatics and Natural Intelligence (IJCINI), 14(2): 1-15.
Huo, S., Völzer, H., Reddy, P., Agarwal, P., Isahagian, V. & Muthusamy, V. (2021). Graph autoencoders for business process anomaly detection. Business Process Management: 19th International Conference, BPM 2021, Rome, Italy, September 06–10, 2021, Proceedings 19.
Ingvaldsen, J.E. & Gulla, J.A. (2007). Preprocessing support for large scale process mining of SAP transactions. International Conference on Business process management.
Junior, S.B., Ceravolo, P., Damiani, E., Omori, N.J. & Tavares, G.M. (2020). Anomaly detection on event logs with a scarcity of labels. 2020 2nd International Conference on Process Mining (ICPM).
Khadivar, A., Frouzi, F. & Niakan, L. (2020). Risk Analysis and Compliancechecking of Business Rules in Insurance by using of Process Mining Technique. Iranian Journal of Insurance Research (IJIR), 9(2): 33-65. https://doi.org/10.22056/ijir.2020.02.02 [in persian]
Kherbouche, M.O., Laga, N. & Masse, P.-A. (2016). Towards a better assessment of event logs quality. 2016 IEEE Symposium Series on Computational Intelligence (SSCI).
Khojasteh, F., Kahani, M. & Behkamal, B. (2021). Concept Drift Detection in Business Process Logs using Deep Learning. Signal and Data Processing, 17(4): 33-48.      
http://dx.doi.org/10.29252/jsdp.17.4.33 [in persian]
Khoshkhoy Nilash, E., Tamjid Yamechlo, A. & Rad, R. (2021). Performance Analysis and Improvement of Bank of Industry and Mine Working Capital Facility Processes Based on Process Mining Approach. Business Intelligence Management Studies, 9(36): 37-71.      
https://doi.org/10.22054/ims.2021.58106.1896 [in persian]
Ko, J. & Comuzzi, M. (2020). Online anomaly detection using statistical leverage for streaming business process events. International Conference on Process Mining.
Ko, J. & Comuzzi, M. (2021). Online anomaly detection using statistical leverage for streaming business process events. Process Mining Workshops: ICPM 2020 International Workshops, Padua, Italy, October 5–8, 2020, Revised Selected Papers 2.
Kong, L., Li, C., Ge, J., Li, Z., Zhang, F. & Luo, B. (2019). An Efficient Heuristic Method for Repairing Event Logs Independent of Process Models. IoTBDS.
Krajsic, P. & Franczyk, B. (2020). Lambda Architecture for Anomaly Detection in Online Process Mining Using Autoencoders. International Conference on Computational Collective Intelligence.
Krajsic, P. & Franczyk, B. (2021). Variational Autoencoder for Anomaly Detection in Event Data in Online Process Mining. ICEIS.
Kurniati, A.P., Rojas, E., Hogg, D., Hall, G. & Johnson, O.A. (2019). The assessment of data quality issues for process mining in healthcare using Medical Information Mart for Intensive Care III, a freely available e-health record database. Health informatics journal, 25(4): 1878-1893. https://doi.org/10.1177/1460458218810760
Laranjeiro, N., Soydemir, S.N. & Bernardino, J. (2015). A survey on data quality: classifying poor data. 2015 IEEE 21st Pacific rim international symposium on dependable computing (PRDC).
Leemans, S.J., Fahland, D. & van der Aalst, W.M. (2013). Discovering block-structured process models from event logs containing infrequent behaviour. International conference on business process management.
Li, F. (2020). Leading digital transformation: three emerging approaches for managing the transition. International Journal of Operations & Production Management, 40(6): 809-817. https://doi.org/10.1108/IJOPM-04-2020-0202
Liu, J., Xu, J., Zhang, R. & Reiff-Marganiec, S. (2021). A repairing missing activities approach with succession relation for event logs. Knowledge and Information Systems, 63(2): 477-495. https://doi.org/10.1007/s10115-020-01524-6
Lu, K., Fang, X., Fang, N. & Asare, E. (2021). Discovery of effective infrequent sequences based on maximum probability path. Connection Science, 34(1): 63-82.         
https://doi.org/10.1080/09540091.2021.1951667
Lu, X. & Fahland, D. (2017). A Conceptual Framework for Understanding Event Data Quality for Behavior Analysis. 9th Central European Workshop on Services and their Composition Zeus Workshop 2017, 13-14 februari 2017, Lugano, Switserland, Switzerland.
Lu, X. (2016). Handling Duplicated Tasks in Process Discovery by Refining Event Labels. Business Process Management: 14th International Conference, BPM 2016, Rio de Janeiro, Brazil, September 18-22. Proceedings 14. Springer International Publishing.  
https://doi.org/10.4121/uuid:ea90c4be-64b6-4f4b-b27c-10ede28da6b6
Lu, X., Fahland, D. & van der Aalst, W.M. (b2016). Interactively Exploring Logs and Mining Models with Clustering, Filtering, and Relabeling. Proceedings of the BPM 2016 Tool Demonstration Track.
Lu, X., Fahland, D., Andrews, R., Suriadi, S., Wynn, M.T., ter Hofstede, A.H. & van der Aalst, W.M. (2017). Semi-supervised log pattern detection and exploration using event concurrence and contextual information. On the Move to Meaningful Internet Systems. OTM 2017 Conferences: Confederated International Conferences: CoopIS, C&TC, and ODBASE 2017, Rhodes, Greece, October 23-27, 2017, Proceedings, Part I.
Lu, X., Fahland, D., van den Biggelaar, F.J. & van der Aalst, W.M. (a2016). Handling duplicated tasks in process discovery by refining event labels. Business Process Management:
14th International Conference, BPM 2016, Rio de Janeiro, Brazil, September 18-22, 2016. Proceedings 14.
Ly, L.T., Indiono, C., Mangler, J. & Rinderle-Ma, S. (2012). Data transformation and semantic log purging for process mining. International Conference on Advanced Information Systems Engineering.
Mannhardt, F. (2016). Sepsis Cases - Event Log. 4TU. ResearchData. Dataset.      
https://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460
Mannhardt, F. (2017). Hospital Billing - Event Log. 4TU.ResearchData. Dataset.   
https://doi.org/10.4121/uuid:76c46b83-c930-4798-a1c9-4be94dfeb741
Martin, N. (2018). Using indoor location system data to enhance the quality of healthcare event logs: opportunities and challenges. International Conference on Business Process Management.
Martin, N. (2019). Using indoor location system data to enhance the quality of healthcare event logs: opportunities and challenges. Business Process Management Workshops: BPM 2018 International Workshops, Sydney, NSW, Australia, September 9-14, 2018, Revised Papers 16.
Martin, N. (2021). Data Quality in Process Mining. Springer. 
https://doi.org/10.1007/978-3-030-53993-1_5
Mingers, J. & Willcocks, L. (2014). An integrative semiotic framework for information systems: The social, personal and material worlds. Information and Organization, 24(1): 48-70. https://doi.org/10.1016/j.infoandorg.2014.01.002
Mostafaee Dolatabad, K., Azar, A., Moghbel Baarz, A. & Parvizian, K. (2019). Mining Process Evaluation in Discovering the Semi-Automatic Processes of the Banking Industry (the case: Bank guarantee issuance process). Industrial Management Studies, 17(52): 1-37. [in persian]
Mueller-Wickop, N. & Schultz, M. (2013). ERP event log preprocessing: timestamps vs. accounting logic. Design Science at the Intersection of Physical and Virtual Design: 8th International Conference, DESRIST 2013, Helsinki, Finland, June 11-12, 2013. Proceedings 8.
Ngai, E.W., Hu, Y., Wong, Y.H., Chen, Y. & Sun, X. (2011). The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decision Support Systems, 50(3): 559-569. https://doi.org/10.1016/j.dss.2010.08.006
Nguyen, H.T.C. & Comuzzi, M. (2018). Event Log Reconstruction Using Autoencoders. International Conference on Service-Oriented Computing.
Nguyen, H.T.C. (2018). Neural Computing for Event Log Quality Improvement Graduate School of UNIST]. URL= http://unist.dcollection.net/common/orgView/200000011233
Nguyen, H.T.C., Lee, S., Kim, J., Ko, J. & Comuzzi, M. (2019). Autoencoders for improving quality of process event logs. Expert Systems with Applications, 131: 132-147.
https://doi.org/10.1016/j.eswa.2019.04.052
Nolle, T., Luettgen, S., Seeliger, A. & Mühlhäuser, M. (2018). Analyzing business process anomalies using autoencoders. Machine Learning, 107(11): 1875-1893.         
https://doi.org/10.1007/s10994-018-5702-8
Nolle, T., Luettgen, S., Seeliger, A. & Mühlhäuser, M. (2019). Binet: Multi-perspective business process anomaly classification. Information Systems, 103: 101458.           
https://doi.org/10.1016/j.is.2019.101458
Nolle, T., Seeliger, A. & Mühlhäuser, M. (2016). Unsupervised anomaly detection in noisy business process event logs using denoising autoencoders. Discovery Science: 19th International Conference, DS 2016, Bari, Italy, October 19–21, 2016, Proceedings 19.
Pauwels, S. & Calders, T. (2019). Detecting anomalies in hybrid business process logs. ACM SIGAPP Applied Computing Review, 19(2): 18-30. https://doi.org/10.1145/3357385.3357387
Petrak, L. & Lorenz, R. (2020). Detecting Infrequent Behavior in Event Logs using Statistical Inference. ATAED@ Petri Nets.
Ramos-Gutiérrez, B., Varela-Vaca, Á.J., Ortega, F.J., Gómez-López, M.T. & Wynn, M. T. (2021). A NLP-oriented methodology to enhance event log quality. Enterprise, Business-Process and Information Systems Modeling: 22nd International Conference, BPMDS 2021, and 26th International Conference, EMMSAD 2021, Held at CAiSE 2021, Melbourne, VIC, Australia, June 28–29, 2021, Proceedings.
Redman, T.C. (1998). The impact of poor data quality on the typical enterprise. Communications of the ACM, 41(2): 79-82. https://doi.org/10.1145/269012.269025
Reinkemeyer, L. (2020). Process Mining in Action, Principles, Use Cases and Outlook. Springer. https://doi.org/10.1007/978-3-030-40172-6
Rogge-Solti, A., Mans, R.S., van der Aalst, W.M. & Weske, M. (2013a). Improving documentation by repairing event logs. IFIP Working Conference on The Practice of Enterprise Modeling.
Rogge-Solti, A., Mans, R.S., van der Aalst, W.M. & Weske, M. (2013b). Repairing event logs using stochastic process models. On the Move to Meaningful Internet Systems: OTM 2013 Workshops.
Sadeghianasl, S., ter Hofstede, A.H., Suriadi, S. & Turkay, S. (2020). Collaborative and interactive detection and repair of activity labels in process event logs. 2020 2nd International Conference on Process Mining (ICPM).
Sadeghianasl, S., ter Hofstede, A.H., Wynn, M.T. & Suriadi, S. (2019). A contextual approach to detecting synonymous and polluted activity labels in process event logs. On the Move to Meaningful Internet Systems: OTM 2019 Conferences: Confederated International Conferences: CoopIS, ODBASE, C&TC 2019, Rhodes, Greece, October 21–25, 2019, Proceedings.
Sani, M.F. (2020). Preprocessing Event Data in Process Mining. International Conference on Advanced Information Systems Engineering.
Sani, M.F., Boltenhagen, M. & van der Aalst, W. (2019). Prototype selection based on clustering and conformance metrics for model discovery. arXiv preprint arXiv: 1912.00736.   
https://doi.org/10.48550/arXiv.1912.00736
Sani, M.F., van Zelst, S.J. & van der Aalst, W.M. (2017). Improving process discovery results by filtering outliers using conditional behavioural probabilities. International Conference on Business Process Management.
Sani, M.F., van Zelst, S.J. & van der Aalst, W.M. (2018a). Applying sequence mining for outlier detection in process mining. OTM Confederated International Conferences on the Move to Meaningful Internet Systems.
Sani, M.F., van Zelst, S.J. & van der Aalst, W.M. (2018b). Repairing outlier behaviour in event logs. International Conference on Business Information Systems.
Sani, M.F., van Zelst, S.J. & van der Aalst, W.M. (2019). Repairing outlier behaviour in event logs using contextual behaviour. Enterprise Modelling and Information Systems Architectures (EMISAJ), 14(5): 1-24.
Shami Zanjani, M., Nabibi, F. & Irandoust, S. (2020). Digital Leadership, a Guide to the Transformation of Organizations in the Digital Age. Tehran: Aryanaghalam Publication.
[in persian]
Sim, S., Bae, H. & Choi, Y. (2019). Likelihood-based multiple imputation by event chain methodology for repair of imperfect event logs with missing data. 2019 International Conference on Process Mining (ICPM).
Song, S., Cao, Y. & Wang, J. (2016). Cleaning timestamps with temporal constraints. Proceedings of the VLDB Endowment, 9(10): 708-719. https://doi.org/10.14778/2977797.2977798
Song, S., Gao, Y., Wang, C., Zhu, X., Wang, J. & Philip, S.Y. (2017). Matching heterogeneous events with patterns. IEEE Transactions on Knowledge and Data Engineering, 29(8): 1695-1708. https://doi.org/10.1109/TKDE.2017.2690912
Song, W., Jacobsen, H.-A. & Zhang, P. (2019). Self-Healing Event Logs. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2019.2956520
Song, W., Xia, X., Jacobsen, H.-A., Zhang, P. & Hu, H. (2015). Heuristic recovery of missing events in process logs. 2015 IEEE International Conference on Web Services.
Steeman, W. (2013). BPI Challenge 2013, incidents. 4TU. ResearchData. Dataset.
https://doi.org/10.4121/uuid:500573e6-accc-4b0c-9576-aa5468b10cee
Sun, X., Hou, W., Yu, D., Wang, J. & Pan, J. (2019). Filtering out noise logs for process modelling based on event dependency. 2019 IEEE International Conference on Web Services (ICWS).
Suriadi, S., Andrews, R., ter Hofstede, A.H. & Wynn, M.T. (2017). Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs. Information systems, 64: 132-150. https://doi.org/10.1016/j.is.2016.07.011
Tax, N. (2019). Mining insights from weakly-structured event data. arXiv preprint arXiv: 1909.01421. https://doi.org/10.48550/arXiv.1909.01421
Tax, N., Alasgarov, E., Sidorova, N., Haakma, R. & van der Aalst, W.M. (a2019). Generating time-based label refinements to discover more precise process models. Journal of Ambient Intelligence and Smart Environments, 11(2): 165-182. https://doi.org/10.3233/AIS-190519
Tax, N., Sidorova, N. & van der Aalst, W.M. (b2019). Discovering more precise process models from event logs by filtering out chaotic activities. Journal of Intelligent Information Systems, 52(1): 107-139. https://doi.org/10.1007/s10844-018-0507-6
van Cruchten, R. (2019). Data Quality in Process Mining: A Rule-based Approach. Interactive Process Mining in Healthcare: 53-79. Springer, Cham.  
http://dx.doi.org/10.1007/978-3-030-53993-1-5
Van Der Aalst, W. (2011). Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer. https://doi.org/10.1007/978-3-642-19345-3
Van der Aalst, W. (2015). Process Mining: Discovery, Conformance and Enhancement of Business Process. Translated by: S.H. Siadat & R. Hemmati Goshtasb. Tehran: Shahid Beheshti Uni. Press. [in persian]
Van Der Aalst, W. (2016). Process Mining, Data science in action. Springer.        
https://doi.org/10.1007/978-3-662-49851-4
Van Der Aalst, W., Adriansyah, A., De Medeiros, A.K.A., Arcieri, F., Baier, T., Blickle, T., Bose, J.C., Van Den Brand, P., Brandtjen, R. & Buijs, J. (2011). Process mining manifesto. International conference on business process management.
Van der Aalst, W., Weijters, T. & Maruster, L. (2004). Workflow mining: Discovering process models from event logs. IEEE Transactions on Knowledge and Data Engineering, 16(9): 1128-1142. https://doi.org/10.1109/TKDE.2004.47
van der Aalst, W.M., Bolt, A. & van Zelst, S.J. (2017). RapidProM: mine your processes and not just your data. arXiv preprint arXiv: 1703.03740. https://doi.org/10.48550/arXiv.1703.03740
van Dongen, B. (2011). Hospital log (BPIC 2011). 4TU.ResearchData. Dataset. https://doi.org/10.4121/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54
van Dongen, B. (2012). BPI Challenge 2012, Event log of a loan application process. 4TU.ResearchData. Dataset. https://doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f
van Dongen, B. (2015). BPI Challenge 2015. 4TU.ResearchData. Dataset. https://doi.org/10.4121/uuid:31a308ef-c844-48da-948c-305d167a0ec1
van Dongen, B. (2017). BPI Challenge 2017. 4TU.ResearchData. Dataset. https://doi.org/10.4121/uuid:5f3067df-f10b-45da-b98b-86ae4c7a310b
van Dongen, B. (2018). BPI Challenge 2018. 4TU.ResearchData. Dataset. https://data.4tu.nl/articles/dataset/BPI_Challenge_2018/12688355
van Dongen, B. (2019). BPI Challenge 2019. 4TU.ResearchData. Dataset. https://doi.org/10.4121/uuid:d06aff4b-79f0-45e6-8ec8-e19730c248f1
van Scheepstal, S. (2016). Data quality within process mining in the auditing context. Tilburg University, Netherlands.
van Zelst, S.J., Sani, M.F., Ostovar, A., Conforti, R. & La Rosa, M. (2018). Filtering spurious events from event streams of business processes. International Conference on Advanced Information Systems Engineering.
van Zelst, S.J., Sani, M.F., Ostovar, A., Conforti, R. & La Rosa, M. (2020). Detection and removal of infrequent behavior from event streams of business processes. Information Systems, 90: 101451. https://doi.org/10.1016/j.is.2019.101451
Vanbrabant, L., Martin, N., Ramaekers, K. & Braekers, K. (2019). Quality of input data in emergency department simulations: framework and assessment techniques. Simulation Modelling Practice and Theory, 91: 83-101. https://doi.org/10.1016/j.simpat.2018.12.002
Verbeek, H., Buijs, J., Van Dongen, B. & van der Aalst, W.M. (2010). ProM 6: The process mining toolkit. Proc. of BPM Demonstration Track, 615: 34-39.
Verhulst, R. (2016). Evaluating quality of event data within event logs: an extensible framework. Master’s thesis, Eindhoven University of Technology.
Vidgof, M., Djurica, D., Bala, S. & Mendling, J. (2020). Cherry-picking from spaghetti: Multi-range filtering of event logs. Enterprise, Business-Process and Information Systems Modeling: 21st International Conference, BPMDS 2020, 25th International Conference, EMMSAD 2020, Held at CAiSE 2020, Grenoble, France, June 8–9, 2020, Proceedings 21.
Vijayakamal, M. & Vasumathi, D. (2019). Unsupervised Learning Methods for Anomaly Detection and Log Quality Improvement Using Process Event Log. International Journal of Advanced Science and Technology, 29: 1109-1125.
Walicki, M. & Ferreira, D.R. (2011). Sequence partitioning for process mining with unlabeled event logs. Data & Knowledge Engineering, 70(10): 821-841. https://doi.org/10.1016/j.datak.2011.05.003
Wang, J., Song, S., Lin, X., Zhu, X. & Pei, J. (2015). Cleaning structured event logs: A graph repair approach. 2015 IEEE 31st International Conference on Data Engineering.
Wang, J., Song, S., Zhu, X. & Lin, X. (2013). Efficient recovery of missing events. Proceedings of the VLDB Endowment, 6(10): 841-852. https://doi.org/10.14778/2536206.2536212
Wang, J., Song, S., Zhu, X., Lin, X. & Sun, J. (2016). Efficient recovery of missing events. IEEE Transactions on Knowledge and Data Engineering, 28(11): 2943-2957.
Wang, L.-l., Fang, X.-W., Asare, E. & Huan, F. (2021). An Optimization Approach for Mining of Process Models with Infrequent Behaviors Integrating Data Flow and Control Flow. Scientific Programming, 2021. https://doi.org/10.1155/2021/8874316
Wynn, M.T. & Sadiq, S. (2019). Responsible process mining-a data quality perspective. Business Process Management: 17th International Conference, BPM 2019, Vienna, Austria, September 1–6, 2019, Proceedings 17.
Xia, X., Song, W., Chen, F., Li, X. & Zhang, P. (2016). Effa: A ProM plugin for recovering event logs. Proceedings of the 8th Asia-Pacific Symposium on Internetware.
Xu, J. & Liu, J. (2019). A profile clustering-based event logs repairing approach for process mining. IEEE Access, 7: 17872-17881. https://doi.org/10.1109/ACCESS.2019.2894905
Yi, G. & Peng, Z. (2019). Novel Approach to Discover Precise Process Model by Filtering out Log Chaotic Activities. Journal of Computers, 30(4): 140-150.
Zerbino, P., Stefanini, A. & Aloini, D. (2021). Process science in action: A literature review on process mining in business management. Technological Forecasting and Social Change, 172: 121021. https://doi.org/10.1016/j.techfore.2021.121021
Zhu, X., Song, S., Lian, X., Wang, J. & Zou, L. (2014). Matching heterogeneous event data. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data.
Zhu, X., Song, S., Wang, J., Philip, S.Y. & Sun, J. (2014). Matching heterogeneous events with patterns. 2014 IEEE 30th International Conference on Data Engineering.