Parallel Implementation of Apriori Algorithm Based on MapReduce
Searching for frequent patterns in transactional databases is considered one of the most important data mining problems, and Apriori is one of the typical algorithms for this task. Developing fast and efficient algorithms that can handle large volumes of data becomes a challenging task due to the larg...
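The idea summarized above can be illustrated with a minimal single-process sketch. It is not the authors' implementation and does not use Hadoop or any MapReduce runtime: the function names (`map_phase`, `reduce_phase`, `generate_candidates`, `apriori_mapreduce`) and the sample transactions are assumptions made for illustration only. Each pass over itemset size k is modeled as one map step (emit a count of 1 for every candidate contained in a transaction) followed by one reduce step (sum the counts per candidate).

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, candidates):
    """Mapper (hypothetical): emit (candidate, 1) for each candidate in the transaction."""
    items = set(transaction)
    for cand in candidates:
        if cand <= items:
            yield cand, 1


def reduce_phase(pairs):
    """Reducer (hypothetical): sum the counts emitted for each candidate itemset."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return counts


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets and prune with the Apriori property."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            # Keep the union only if it has size k and all (k-1)-subsets are frequent.
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def apriori_mapreduce(transactions, min_support):
    """Simulate one map/reduce round per itemset size until no candidates remain."""
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # In a real cluster the transactions would be split across mapper nodes.
        pairs = (p for t in transactions for p in map_phase(t, candidates))
        counts = reduce_phase(pairs)
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


# Illustrative data only; not taken from the paper's experiments.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]
frequent_itemsets = apriori_mapreduce(transactions, min_support=3)
```

Here `min_support` is an absolute count rather than a fraction. In a distributed setting, the map step would run independently on each data split, and only the (candidate, count) pairs would be shuffled to reducers.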
Main Authors: | Ning Li, Li Zeng, Qing He, Zhongzhi Shi |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2013-04-01 |
Series: | International Journal of Networked and Distributed Computing (IJNDC) |
Subjects: | Apriori algorithm; Frequent itemsets; MapReduce; Parallel implementation; Large database |
Online Access: | https://www.atlantis-press.com/article/8360.pdf |
_version_ | 1797934587819589632 |
---|---|
author | Ning Li; Li Zeng; Qing He; Zhongzhi Shi |
author_facet | Ning Li; Li Zeng; Qing He; Zhongzhi Shi |
author_sort | Ning Li |
collection | DOAJ |
description | Searching for frequent patterns in transactional databases is considered one of the most important data mining problems, and Apriori is one of the typical algorithms for this task. Developing fast and efficient algorithms that can handle large volumes of data is challenging because of the size of the databases involved. In this paper, we implement a parallel Apriori algorithm based on MapReduce, a framework for processing huge datasets on certain kinds of distributable problems using a large number of computers (nodes). The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware. |
first_indexed | 2024-04-10T18:01:20Z |
format | Article |
id | doaj.art-1f51cbb4a0e64de09c0cea09024386e0 |
institution | Directory Open Access Journal |
issn | 2211-7946 |
language | English |
last_indexed | 2024-04-10T18:01:20Z |
publishDate | 2013-04-01 |
publisher | Springer |
record_format | Article |
series | International Journal of Networked and Distributed Computing (IJNDC) |
spelling | doaj.art-1f51cbb4a0e64de09c0cea09024386e0; 2023-02-02T15:05:25Z; eng; Springer; International Journal of Networked and Distributed Computing (IJNDC); 2211-7946; 2013-04-01; 1; 2; 10.2991/ijndc.2013.1.2.3; Parallel Implementation of Apriori Algorithm Based on MapReduce; Ning Li; Li Zeng; Qing He; Zhongzhi Shi; https://www.atlantis-press.com/article/8360.pdf; Apriori algorithm; Frequent itemsets; MapReduce; Parallel implementation; Large database. |
title | Parallel Implementation of Apriori Algorithm Based on MapReduce |
topic | Apriori algorithm; Frequent itemsets; MapReduce; Parallel implementation; Large database. |
url | https://www.atlantis-press.com/article/8360.pdf |