Parallel Implementation of Apriori Algorithm Based on MapReduce

Searching for frequent patterns in transactional databases is considered one of the most important data mining problems, and Apriori is one of the typical algorithms for this task. Developing fast and efficient algorithms that can handle large volumes of data becomes a challenging task due to the larg...

Bibliographic Details
Main Authors: Ning Li, Li Zeng, Qing He, Zhongzhi Shi
Format: Article
Language: English
Published: Springer 2013-04-01
Series: International Journal of Networked and Distributed Computing (IJNDC)
Subjects: Apriori algorithm; Frequent itemsets; MapReduce; Parallel implementation; Large database
Online Access: https://www.atlantis-press.com/article/8360.pdf
author Ning Li
Li Zeng
Qing He
Zhongzhi Shi
collection DOAJ
description Searching for frequent patterns in transactional databases is considered one of the most important data mining problems, and Apriori is one of the typical algorithms for this task. Because these databases are large, developing fast and efficient algorithms that can handle such volumes of data is challenging. In this paper, we implement a parallel Apriori algorithm based on MapReduce, a framework for processing huge datasets on certain kinds of distributable problems using a large number of computers (nodes). The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware.
first_indexed 2024-04-10T18:01:20Z
format Article
id doaj.art-1f51cbb4a0e64de09c0cea09024386e0
institution Directory Open Access Journal
issn 2211-7946
language English
last_indexed 2024-04-10T18:01:20Z
publishDate 2013-04-01
publisher Springer
record_format Article
series International Journal of Networked and Distributed Computing (IJNDC)
spelling doaj.art-1f51cbb4a0e64de09c0cea09024386e0 | 2023-02-02T15:05:25Z | eng | Springer | International Journal of Networked and Distributed Computing (IJNDC) | 2211-7946 | 2013-04-01 | 1 | 2 | 10.2991/ijndc.2013.1.2.3 | Parallel Implementation of Apriori Algorithm Based on MapReduce | Ning Li | Li Zeng | Qing He | Zhongzhi Shi | https://www.atlantis-press.com/article/8360.pdf | Apriori algorithm; Frequent itemsets; MapReduce; Parallel implementation; Large database.
title Parallel Implementation of Apriori Algorithm Based on MapReduce
topic Apriori algorithm; Frequent itemsets; MapReduce; Parallel implementation; Large database.
url https://www.atlantis-press.com/article/8360.pdf