Mining High Utility Time Interval Sequences Using MapReduce Approach: Multiple Utility Framework

Mining high utility sequential patterns is a significant research topic in data mining. Several methods mine sequential patterns while taking utility values into consideration. Patterns of this type reveal the order in which items were purchased, but not the time interval bet...

Bibliographic Details
Main Authors: Sumalatha Saleti, T. Jaya Lakshmi, Mohd Wazih Ahmad
Format: Article
Language: English
Published: IEEE 2022-01-01
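Series: IEEE Access
Subjects: Data mining; MapReduce framework; multiple utility thresholds; sequential pattern mining; time interval patterns
Online Access: https://ieeexplore.ieee.org/document/9961173/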
author Sumalatha Saleti
T. Jaya Lakshmi
Mohd Wazih Ahmad
collection DOAJ
description Mining high utility sequential patterns is a significant research topic in data mining. Several methods mine sequential patterns while taking utility values into consideration. Patterns of this type reveal the order in which items were purchased, but not the time interval between them. The time interval between items is important in many real-world applications, including retail market basket analysis, stock market fluctuations, and DNA sequence analysis. Very few algorithms for mining sequential patterns consider both utility and time interval. Those that do assume the same threshold for each item and maintain the same unit profit. Moreover, with the rapid growth in data, traditional algorithms cannot handle big data and are not scalable. To address these problems, we propose a distributed three-phase MapReduce framework that considers multiple utilities and is suitable for handling big data. The time constraints are pushed into the algorithm instead of relying on pre-defined intervals. In addition, the proposed upper bound minimizes the number of candidate patterns generated during the mining process. The approach has been tested, and the experimental results show its efficiency in terms of run time, memory utilization, and scalability.
first_indexed 2024-04-12T05:57:35Z
format Article
id doaj.art-ab135d866fd1402d8b43770f819608ba
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-12T05:57:35Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-ab135d866fd1402d8b43770f819608ba
2022-12-22T03:45:07Z
eng
IEEE
IEEE Access
2169-3536
2022-01-01
Volume 10, pages 123301-123315
DOI: 10.1109/ACCESS.2022.3224217
IEEE document number: 9961173
Sumalatha Saleti (https://orcid.org/0000-0003-1368-4993), Department of Computer Science and Engineering, SRM University AP, Guntur, Amaravati, India
T. Jaya Lakshmi (https://orcid.org/0000-0003-0183-4093), Department of Computer Science and Engineering, SRM University AP, Guntur, Amaravati, India
Mohd Wazih Ahmad (https://orcid.org/0000-0001-5614-2591), Department of Computer Science and Engineering, Adama Science and Technology University, Adama, Ethiopia
title Mining High Utility Time Interval Sequences Using MapReduce Approach: Multiple Utility Framework
topic Data mining
MapReduce framework
multiple utility thresholds
sequential pattern mining
time interval patterns
url https://ieeexplore.ieee.org/document/9961173/