An Efficient Parallel Mining Algorithm Representative Pattern Set of Large-Scale Itemsets in IoT
With the advent of the age of big data, people can collect rich and diverse data from a wide variety of collection devices, such as those of the Internet of Things. Knowledge hidden in large data is very useful and valuable. Frequent pattern mining, as a basic method of data mining, is applied to ev...
Main Authors: | Zhang Tianrui, Wei Mingqi, Liu Bin |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2018-01-01 |
Series: | IEEE Access |
Subjects: | Big data, IoT data analysis, MapReduce, online parallel mining, representative pattern |
Online Access: | https://ieeexplore.ieee.org/document/8558539/ |
_version_ | 1818855179226710016 |
---|---|
author | Zhang Tianrui; Wei Mingqi; Liu Bin |
author_facet | Zhang Tianrui; Wei Mingqi; Liu Bin |
author_sort | Zhang Tianrui |
collection | DOAJ |
description | With the advent of the age of big data, people can collect rich and diverse data from a wide variety of collection devices, such as those of the Internet of Things. Knowledge hidden in large data is very useful and valuable. Frequent pattern mining, as a basic method of data mining, is applied to every aspect of society. However, the application of traditional frequent pattern mining methods to big data involves bottlenecks due to the large number of result sets. Such bottlenecks make it difficult to produce practical value in production and life. Therefore, mining representative pattern sets has been proposed. However, most existing algorithms select representative patterns after mining frequent pattern sets. This framework can make the runtime difficult to evaluate in large data environments. To solve the above-mentioned problems, this paper presents an online representative pattern-set parallel-mining algorithm. Within the parallel MapReduce framework, this algorithm uses horizontal segmentation to process the database and then applies the online mining algorithm to mine the locally represented pattern sets on each small database. Finally, several performance optimization strategies are proposed. As shown by numerous experiments on the actual dataset, the algorithm proposed in this paper improves the time efficiency by one order of magnitude. Several optimization strategies reduce the execution time to varying degrees. |
first_indexed | 2024-12-19T08:04:29Z |
format | Article |
id | doaj.art-0ff0c44ddaf64f03b844ec0ca70e8073 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T08:04:29Z |
publishDate | 2018-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-0ff0c44ddaf64f03b844ec0ca70e8073; 2022-12-21T20:29:46Z; eng; IEEE; IEEE Access; 2169-3536; 2018-01-01; 6; 79162; 79173; 10.1109/ACCESS.2018.2884888; 8558539; An Efficient Parallel Mining Algorithm Representative Pattern Set of Large-Scale Itemsets in IoT; Zhang Tianrui [0] https://orcid.org/0000-0002-6415-6552; Wei Mingqi [1]; Liu Bin [2]; Shenyang University, Shenyang, China; Shenyang University, Shenyang, China; Shenyang University, Shenyang, China; With the advent of the age of big data, people can collect rich and diverse data from a wide variety of collection devices, such as those of the Internet of Things. Knowledge hidden in large data is very useful and valuable. Frequent pattern mining, as a basic method of data mining, is applied to every aspect of society. However, the application of traditional frequent pattern mining methods to big data involves bottlenecks due to the large number of result sets. Such bottlenecks make it difficult to produce practical value in production and life. Therefore, mining representative pattern sets has been proposed. However, most existing algorithms select representative patterns after mining frequent pattern sets. This framework can make the runtime difficult to evaluate in large data environments. To solve the above-mentioned problems, this paper presents an online representative pattern-set parallel-mining algorithm. Within the parallel MapReduce framework, this algorithm uses horizontal segmentation to process the database and then applies the online mining algorithm to mine the locally represented pattern sets on each small database. Finally, several performance optimization strategies are proposed. As shown by numerous experiments on the actual dataset, the algorithm proposed in this paper improves the time efficiency by one order of magnitude. Several optimization strategies reduce the execution time to varying degrees.; https://ieeexplore.ieee.org/document/8558539/; Big data; IoT data analysis; MapReduce; online parallel mining; representative pattern |
title | An Efficient Parallel Mining Algorithm Representative Pattern Set of Large-Scale Itemsets in IoT |
title_sort | efficient parallel mining algorithm representative pattern set of large scale itemsets in iot |
topic | Big data; IoT data analysis; MapReduce; online parallel mining; representative pattern |
url | https://ieeexplore.ieee.org/document/8558539/ |
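
The abstract in this record outlines a pipeline: horizontally segment the database, mine a local representative pattern set on each segment inside a MapReduce-style framework, then combine the results. The Python sketch below illustrates that general shape only and is not the authors' algorithm. Several pieces are assumptions that do not appear in the record: the function names, the brute-force itemset counting (capped at `max_len` items), the epsilon-cover test `1 - supp(superset) / supp(pattern) <= eps` borrowed from the wider representative-pattern literature, and the reduce step that recounts candidates over the full database.

```python
from collections import Counter
from itertools import combinations


def partition(transactions, n_parts):
    """Horizontal segmentation: deal rows round-robin into n_parts chunks."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def local_frequent(chunk, min_ratio, max_len=3):
    """Count every itemset of up to max_len items in one chunk (brute force,
    an assumption for brevity) and keep the locally frequent ones."""
    counts = Counter()
    for t in chunk:
        items = sorted(set(t))
        for k in range(1, max_len + 1):
            counts.update(combinations(items, k))
    threshold = min_ratio * len(chunk)
    return {p: c for p, c in counts.items() if c >= threshold}


def representatives(patterns, eps):
    """Greedy epsilon-cover (assumed definition): visit longer patterns first
    and drop a pattern when a kept superset has nearly the same support."""
    kept = {}
    ordered = sorted(patterns.items(), key=lambda kv: (-len(kv[0]), kv[0]))
    for p, s in ordered:
        covered = any(
            set(p) <= set(q) and 1 - sq / s <= eps for q, sq in kept.items()
        )
        if not covered:
            kept[p] = s
    return kept


def mine_representatives(transactions, n_parts, min_ratio, eps):
    """Map: local representative sets per chunk. Reduce (assumed): recount the
    union of local candidates globally, filter, and select representatives."""
    chunks = partition(transactions, n_parts)
    local_sets = map(
        lambda c: representatives(local_frequent(c, min_ratio), eps), chunks
    )
    candidates = set().union(*(s.keys() for s in local_sets))
    threshold = min_ratio * len(transactions)
    global_support = {}
    for cand in candidates:
        support = sum(1 for t in transactions if set(cand) <= set(t))
        if support >= threshold:
            global_support[cand] = support
    return representatives(global_support, eps)


if __name__ == "__main__":
    data = [
        ["a", "b", "c"], ["a", "b", "c"], ["a", "b"],
        ["a", "b", "c"], ["a", "c"], ["a", "b", "c"],
    ]
    print(mine_representatives(data, n_parts=2, min_ratio=0.5, eps=0.4))
```

The built-in `map` stands in for a distributed map phase; in a real deployment each chunk would be processed by a separate worker, and the global recount would itself be a second distributed pass.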