Sequential pattern mining using personalized minimum support threshold with minimum items

One of the challenges of Sequential Pattern Mining is finding frequent sequential patterns in huge click stream data (web logs), since such data has a very low support distribution. Under a Frequent Pattern Discovery technique, a sequence is considered frequent if it occurs more...


Bibliographic Details
Main Authors: Alias, Suraya, Razali, Mohd Norhisham, Tan, Soo Fun, Sainin, Mohd Shamrie
Format: Conference or Workshop Item
Language: English
Published: 2011
Subjects: QA76 Computer software
Online Access: https://repo.uum.edu.my/id/eprint/12320/1/06.pdf
author Alias, Suraya
Razali, Mohd Norhisham
Tan, Soo Fun
Sainin, Mohd Shamrie
collection UUM
description One of the challenges of Sequential Pattern Mining is finding frequent sequential patterns in huge click stream data (web logs), since such data has a very low support distribution. Under a Frequent Pattern Discovery technique, a sequence is considered frequent if it occurs more often than the minimum support (min sup) threshold value. The conventional assumption that one min sup value is valid for all levels of k-sequence may affect the overall results or pattern generation. In this paper, a personalized minimum support (P_minsup) threshold with user-specified minimum items, or min_i, is introduced. The P_minsup is generated for each k-sequence by analyzing the overall support pattern distribution of the click stream data, while the min_i value gives the user control over the number of patterns generated for the next k-sequence by using only the top min_i items. This approach is then applied in the SPADE algorithm using a vector array, as an extension of the previous method that used a relational database and a pre-defined threshold. The results of this experiment demonstrate that P_minsup, complemented by the min_i value, is applicable in helping to determine a suitable threshold value for detecting users' frequent k-sequential topics when navigating the World Wide Web (WWW).
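The idea in the description can be illustrated with a minimal sketch. The record does not give the formula for P_minsup or the details of the SPADE vector-array implementation, so everything here is an assumption made for illustration: the per-level threshold is taken to be the mean support of the candidates seen at that level, a pattern is kept when its support is at least that threshold, min_i is read as "extend only the top min_i patterns by support", and the names is_subsequence, personalized_minsup and mine_levels are hypothetical. The search is a naive level-wise scan over sessions, not SPADE.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Pattern, session: Sequence[str]) -> bool:
    """True if the pattern's items occur in the session in order (gaps allowed)."""
    remaining = iter(session)
    # Each membership test consumes the iterator up to the match,
    # so later items must appear after earlier ones.
    return all(item in remaining for item in pattern)


def personalized_minsup(supports: Dict[Pattern, int]) -> float:
    """ASSUMPTION: the per-level threshold is the mean support at that level."""
    return sum(supports.values()) / len(supports)


def mine_levels(
    sessions: List[Sequence[str]], min_i: int, max_k: int = 3
) -> Dict[int, Tuple[float, Dict[Pattern, int]]]:
    """Return {k: (threshold used at level k, frequent k-sequences with support)}."""
    items = sorted({item for session in sessions for item in session})
    candidates: List[Pattern] = [(item,) for item in items]
    frequent_items: List[str] = []
    levels: Dict[int, Tuple[float, Dict[Pattern, int]]] = {}

    for k in range(1, max_k + 1):
        # Support = number of sessions that contain the candidate.
        supports = {c: sum(is_subsequence(c, s) for s in sessions) for c in candidates}
        # Candidates that never occur are dropped before the threshold is computed.
        supports = {c: n for c, n in supports.items() if n > 0}
        if not supports:
            break

        threshold = personalized_minsup(supports)
        # ASSUMPTION: ">=" is used so the best-supported candidate always survives.
        frequent = {c: n for c, n in supports.items() if n >= threshold}
        levels[k] = (threshold, frequent)

        if k == 1:
            frequent_items = [c[0] for c in sorted(frequent)]

        # Only the top min_i patterns (by support, ties broken alphabetically)
        # are extended, which caps the candidates at the next level.
        top = sorted(frequent, key=lambda c: (-frequent[c], c))[:min_i]
        candidates = [p + (x,) for p in top for x in frequent_items]

    return levels
```

Calling mine_levels(sessions, min_i=2) on a list of click sessions returns one threshold per level, so a lower threshold can apply to longer sequences whose support is naturally smaller, while min_i limits how many patterns are carried forward.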
first_indexed 2024-07-04T05:49:35Z
format Conference or Workshop Item
id uum-12320
institution Universiti Utara Malaysia
language English
last_indexed 2024-07-04T05:49:35Z
publishDate 2011
record_format eprints
spelling uum-12320 2014-10-21T01:05:34Z https://repo.uum.edu.my/id/eprint/12320/ QA76 Computer software 2011 Conference or Workshop Item PeerReviewed application/pdf en https://repo.uum.edu.my/id/eprint/12320/1/06.pdf
Alias, Suraya and Razali, Mohd Norhisham and Tan, Soo Fun and Sainin, Mohd Shamrie (2011) Sequential pattern mining using personalized minimum support threshold with minimum items. In: International Conference on Research and Innovation in Information Systems (ICRIIS), 23-24 Nov. 2011, Kuala Lumpur. http://dx.doi.org/10.1109/ICRIIS.2011.6125688 doi:10.1109/ICRIIS.2011.6125688
title Sequential pattern mining using personalized minimum support threshold with minimum items
topic QA76 Computer software
url https://repo.uum.edu.my/id/eprint/12320/1/06.pdf