Sequential pattern mining using personalized minimum support threshold with minimum items
One of the challenges of Sequential Pattern Mining is finding frequent sequential patterns in huge click-stream data (web logs), because such data has a very low support distribution. In Frequent Pattern Discovery, a sequence is considered frequent if it occurs more...
Main Authors: | Alias, Suraya; Razali, Mohd Norhisham; Tan, Soo Fun; Sainin, Mohd Shamrie |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2011 |
Subjects: | QA76 Computer software |
Online Access: | https://repo.uum.edu.my/id/eprint/12320/1/06.pdf |
_version_ | 1825802988149538816 |
---|---|
author | Alias, Suraya; Razali, Mohd Norhisham; Tan, Soo Fun; Sainin, Mohd Shamrie |
author_sort | Alias, Suraya |
collection | UUM |
description | One of the challenges of Sequential Pattern Mining is finding frequent sequential patterns in huge click-stream data (web logs), because such data has a very low support distribution. In Frequent Pattern Discovery, a sequence is considered frequent if it occurs more often than the minimum support (min sup) threshold value. The conventional approach of assuming that one min sup value is valid for all levels of k-sequence may affect the overall results or pattern generation. In this paper, a personalized minimum support (P_minsup) threshold with user-specified minimum items (min_i) is introduced. P_minsup is generated for each k-sequence by analyzing the overall support pattern distribution of the click-stream data, while the min_i value gives the user control over the number of patterns generated for the next k-sequence by using the top min_i items. This approach is then applied in the SPADE algorithm using a vector array, as an extension of the previous method that used a relational database and a pre-defined threshold. The experimental results demonstrate that P_minsup, complemented by the min_i value, can assist in determining a suitable threshold value for detecting users' frequent k-sequential topics when navigating the World Wide Web (WWW). |
first_indexed | 2024-07-04T05:49:35Z |
format | Conference or Workshop Item |
id | uum-12320 |
institution | Universiti Utara Malaysia |
language | English |
last_indexed | 2024-07-04T05:49:35Z |
publishDate | 2011 |
record_format | eprints |
spelling | uum-12320 2014-10-21T01:05:34Z https://repo.uum.edu.my/id/eprint/12320/ QA76 Computer software 2011 Conference or Workshop Item PeerReviewed application/pdf en https://repo.uum.edu.my/id/eprint/12320/1/06.pdf Alias, Suraya and Razali, Mohd Norhisham and Tan, Soo Fun and Sainin, Mohd Shamrie (2011) Sequential pattern mining using personalized minimum support threshold with minimum items. In: International Conference on Research and Innovation in Information Systems (ICRIIS), 23-24 Nov. 2011, Kuala Lumpur. http://dx.doi.org/10.1109/ICRIIS.2011.6125688 doi:10.1109/ICRIIS.2011.6125688 |
title | Sequential pattern mining using personalized minimum support threshold with minimum items |
topic | QA76 Computer software |
url | https://repo.uum.edu.my/id/eprint/12320/1/06.pdf |
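
The abstract describes two controls: a threshold derived separately for each k-sequence level from the support distribution, and a user-chosen min_i that limits how many patterns seed the next level. The Python sketch below illustrates that idea only and is not the authors' implementation. The record does not state how P_minsup is computed, so the mean support per level is used here as a stand-in assumption. The function names (`count_support`, `level_threshold`, `mine_levels`), the sample sessions, and the reading of "top min_i items" as the min_i highest-support patterns are likewise hypothetical. For brevity, support is counted by scanning sessions directly rather than with SPADE's vertical id-lists.

```python
from statistics import mean


def count_support(sessions, candidates):
    """Count, per candidate, the sessions that contain it as an ordered subsequence."""
    def contains(session, pattern):
        it = iter(session)
        # Each `page in it` consumes the iterator, so order is respected and gaps are allowed.
        return all(page in it for page in pattern)

    return {c: sum(contains(s, c) for s in sessions) for c in candidates}


def level_threshold(supports):
    # ASSUMPTION: the mean of the observed supports stands in for P_minsup.
    # The catalog record does not give the actual formula.
    return mean(supports.values())


def mine_levels(sessions, min_i, max_k=3):
    """Return {k: (threshold, {pattern: support})} for k = 1..max_k."""
    items = sorted({page for s in sessions for page in s})
    candidates = [(page,) for page in items]
    results = {}
    for k in range(1, max_k + 1):
        supports = {c: n for c, n in count_support(sessions, candidates).items() if n > 0}
        if not supports:
            break
        threshold = level_threshold(supports)
        # `>=` (an assumption) keeps a level with uniform supports from being emptied.
        frequent = {c: n for c, n in supports.items() if n >= threshold}
        results[k] = (threshold, frequent)
        # ASSUMPTION: only the top min_i frequent patterns are extended to level k+1.
        seeds = sorted(frequent, key=lambda c: (-frequent[c], c))[:min_i]
        candidates = [seed + (page,) for seed in seeds for page in items]
        if not candidates:
            break
    return results


# Hypothetical click-stream sessions (one list of visited topics per user session).
sessions = [
    ["home", "news", "sport"],
    ["home", "news"],
    ["home", "sport"],
    ["news", "sport"],
    ["home", "news", "sport", "shop"],
]

results = mine_levels(sessions, min_i=2)
for k, (threshold, frequent) in results.items():
    print(k, round(threshold, 2), frequent)
```

A smaller min_i narrows the candidate set at each level, which is the kind of user control over pattern growth that the abstract attributes to min_i.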