Understanding Contrail Business Processes through Hierarchical Clustering: A Multi-Stage Framework

Real-world business processes are dynamic, with event logs that are generally unstructured and contain heterogeneous business classes. Process mining techniques derive useful knowledge from such logs but translating them into simplified and logical segments is crucial. Complexity is increased when d...

Bibliographic Details
Main Authors: Zeeshan Tariq, Naveed Khan, Darryl Charles, Sally McClean, Ian McChesney, Paul Taylor
Format: Article
Language: English
Published: MDPI AG 2020-09-01
Series: Algorithms
Subjects: process mining, trace clustering, machine learning, knowledge discovery, process analytics
Online Access: https://www.mdpi.com/1999-4893/13/10/244
author Zeeshan Tariq
Naveed Khan
Darryl Charles
Sally McClean
Ian McChesney
Paul Taylor
collection DOAJ
description Real-world business processes are dynamic, with event logs that are generally unstructured and contain heterogeneous business classes. Process mining techniques derive useful knowledge from such logs, but translating them into simplified and logical segments is crucial. Complexity increases when dealing with business processes that have a large number of events and no outcome labels. Techniques such as trace clustering and event clustering tend to simplify complex business logs, but the resulting clusters are generally not understandable to business users because the business aspects of the process are not considered while clustering the process log. In this paper, we provide a multi-stage hierarchical framework for business-logic driven clustering of highly variable process logs with an extremely large number of events. Firstly, we introduce the term contrail processes to describe the characteristics of such complex real-world business processes and their logs, which present contrail-like models. Secondly, we propose an algorithm, Novel Hierarchical Clustering (NoHiC), to discover business-logic driven clusters from these contrail processes. For clustering, the raw event log is first decomposed into high-level business classes, and feature engineering is then performed exclusively on the business-context features to support the discovery of meaningful business clusters. For the experiments, we used a hybrid approach that combines a rule-based mining technique with a novel form of agglomerative hierarchical clustering. A case study of a CRM process of a renowned UK telecommunication firm is presented, and the quality of the proposed framework is verified through several measures, such as cluster segregation, classification accuracy, and fitness of the log. We compared the NoHiC technique with two trace clustering techniques using two real-world process logs. The clusters discovered through NoHiC are found to have improved fitness compared to the other techniques, and they also hold valuable information about the business context of the process log.
first_indexed 2024-03-10T16:00:31Z
format Article
id doaj.art-2a3761a7afca4b3aa67d141410cb86e9
institution Directory Open Access Journal
issn 1999-4893
language English
last_indexed 2024-03-10T16:00:31Z
publishDate 2020-09-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj.art-2a3761a7afca4b3aa67d141410cb86e9
2023-11-20T15:19:05Z
eng
MDPI AG
Algorithms
1999-4893
2020-09-01
13
10
244
10.3390/a13100244
Zeeshan Tariq, School of Computing, Ulster University, Newtownabbey BT37 0QB, UK
Naveed Khan, School of Computing, Ulster University, Newtownabbey BT37 0QB, UK
Darryl Charles, School of Computing, Ulster University, Newtownabbey BT37 0QB, UK
Sally McClean, School of Computing, Ulster University, Newtownabbey BT37 0QB, UK
Ian McChesney, School of Computing, Ulster University, Newtownabbey BT37 0QB, UK
Paul Taylor, Applied Research, BT, Ipswich IP1 2AU, UK
title Understanding Contrail Business Processes through Hierarchical Clustering: A Multi-Stage Framework
topic process mining
trace clustering
machine learning
knowledge discovery
process analytics
url https://www.mdpi.com/1999-4893/13/10/244