Discovery of effective infrequent sequences based on maximum probability path
Process discovery usually analyses frequent behaviour in event logs to gain an intuitive understanding of processes. However, there are some effective infrequent behaviours that help to improve business processes in real life. Most existing studies either ignore them or treat them as harmful behavio...
Main Authors: | Ke Lu, Xianwen Fang, Na Fang, Esther Asare |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2022-12-01 |
Series: | Connection Science |
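Subjects: | effective infrequent sequences; noise activity; maximum probability path; conditional probability entropy; state transition matrix; process discovery |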
Online Access: | http://dx.doi.org/10.1080/09540091.2021.1951667 |
_version_ | 1797684069525356544 |
---|---|
author | Ke Lu Xianwen Fang Na Fang Esther Asare |
author_sort | Ke Lu |
collection | DOAJ |
description | Process discovery usually analyses frequent behaviour in event logs to gain an intuitive understanding of processes. However, there are some effective infrequent behaviours that help to improve business processes in real life. Most existing studies either ignore them or treat them as harmful behaviours. To distinguish effective infrequent sequences from noisy activities, this paper proposes an algorithm to analyse the distribution states of activities and the strong transfer relationships between behaviours based on maximum probability paths. The algorithm divides episodic traces into two categories: harmful and useful episodes, namely noisy activities and effective sequences. First, using conditional probability entropy, the infrequent logs are pre-processed to remove individual noisy activities that are extremely irregularly distributed in the traces. Effective sequences are then extracted from the logs based on the state transfer information of the activities. The algorithm is based on a PM4Py implementation and is validated using synthetic and real logs. From the results, the algorithm not only preserves the key structure of the model and reduces noise activity, but also improves the quality of the model. |
first_indexed | 2024-03-12T00:24:02Z |
format | Article |
id | doaj.art-5cc9c0d9c1e642c3b7bc902d925ca5c8 |
institution | Directory Open Access Journal |
issn | 0954-0091 1360-0494 |
language | English |
last_indexed | 2024-03-12T00:24:02Z |
publishDate | 2022-12-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | Connection Science |
spelling | doaj.art-5cc9c0d9c1e642c3b7bc902d925ca5c8; 2023-09-15T10:47:59Z; eng; Taylor & Francis Group; Connection Science; ISSN 0954-0091, 1360-0494; 2022-12-01; vol. 34, no. 1, pp. 63-82; 10.1080/09540091.2021.1951667; 1951667; Discovery of effective infrequent sequences based on maximum probability path; Ke Lu, Xianwen Fang, Na Fang, Esther Asare (all Anhui University of Science and Technology); http://dx.doi.org/10.1080/09540091.2021.1951667; effective infrequent sequences, noise activity, maximum probability path, conditional probability entropy, state transition matrix, process discovery |
title | Discovery of effective infrequent sequences based on maximum probability path |
topic | effective infrequent sequences noise activity maximum probability path conditional probability entropy state transition matrix process discovery |
url | http://dx.doi.org/10.1080/09540091.2021.1951667 |
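
The description field mentions a state transition matrix and maximum probability paths but gives no formulas. As a rough illustration only, the sketch below estimates first-order transition probabilities from a list of traces and finds the most probable path between two activities. It is not the authors' algorithm and does not use PM4Py; the function names, the toy log, and the use of Dijkstra's algorithm on negative log-probabilities are all assumptions made for this example.

```python
import heapq
import math
from collections import defaultdict


def transition_probabilities(traces):
    """Estimate first-order transition probabilities P(b | a) from traces.

    Hypothetical helper: each trace is a list of activity labels.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        # Count every directly-follows pair (a, b) in the trace.
        for a, b in zip(trace, trace[1:]):
            counts[a][b] += 1
    probs = {}
    for a, successors in counts.items():
        total = sum(successors.values())
        probs[a] = {b: n / total for b, n in successors.items()}
    return probs


def max_probability_path(probs, start, end):
    """Return (probability, path) of the most probable path from start to end.

    Maximising a product of probabilities is the same as minimising the
    sum of -log(p), so Dijkstra's algorithm applies (all weights are >= 0).
    Returns (0.0, []) when end cannot be reached from start.
    """
    heap = [(0.0, [start])]
    settled = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == end:
            return math.exp(-cost), path
        if node in settled:
            continue
        settled.add(node)
        for nxt, p in probs.get(node, {}).items():
            if nxt not in settled:
                heapq.heappush(heap, (cost - math.log(p), path + [nxt]))
    return 0.0, []


if __name__ == "__main__":
    # Toy log, invented for illustration.
    log = [["a", "b", "d"], ["a", "b", "d"], ["a", "c", "d"], ["a", "b", "c", "d"]]
    probability, path = max_probability_path(transition_probabilities(log), "a", "d")
    print(path, probability)
```

In this toy log the rarer route through "c" scores lower than the route through "b". A method like the one described would additionally need a criterion for deciding which low-frequency sequences are effective rather than noisy, which this sketch does not attempt.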