SDFP-Growth Algorithm as a Novelty of Association Rule Mining Optimization

An essential element of association rules is the strong confidence values that depend on the support value threshold, which determines the optimum number of datasets. The existing method for determining the support value threshold is carried out manually by trial and error; the user determines a sup...

Bibliographic Details
Main Authors:	Boby Siswanto, Haryono Soeparno, Nesti Fronika Sianipar, Widodo Budiharto
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Association rule mining; SDFP-growth algorithm; dimensionality reduction; optimization; FP-tree pruning
Online Access:	https://ieeexplore.ieee.org/document/10418933/

_version_	1797311643817869312
author	Boby Siswanto, Haryono Soeparno, Nesti Fronika Sianipar, Widodo Budiharto
author_sort	Boby Siswanto
collection	DOAJ
description	An essential element of association rules is a strong confidence value, which depends on the support value threshold; this threshold determines the optimum amount of data. The existing method for determining the support value threshold is manual trial and error: the user picks a support value such as 10%, 30%, or 60% by instinct. An inappropriate support value threshold produces useless frequent patterns, overburdens computer resources, and wastes time. The existing formula for predicting the maximum count of frequent patterns is $2^{n} - 1$, where $n$ is the number of distinct items in the dataset. This paper proposes a new SDFP-growth algorithm that does not require manual determination of the support threshold value. The SDFP-growth algorithm performs dimensionality reduction on the original dataset, generating smaller level 1 and level 2 datasets, and thus automatically produces a dataset with an optimum amount of data and a minimum support value threshold. The proposed formula for predicting the maximum number of frequent patterns becomes $2^{\vert A \vert} - 1$, where $\vert A \vert$ is always smaller than $n$. Experiments on five different datasets reduced the number of data dimensions by more than 3% on the level 1 dataset and more than 69% on the level 2 dataset while maintaining the confidence value of the strong rules. Execution time improved by more than 2% on the level 1 dataset and more than 94% on the level 2 dataset.
first_indexed	2024-03-08T02:03:38Z
format	Article
id	doaj.art-4eb404b4dd974ff595bb801da9644b95
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-08T02:03:38Z
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-4eb404b4dd974ff595bb801da9644b95; 2024-02-14T00:01:51Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 21491-21502; DOI 10.1109/ACCESS.2024.3361667; article 10418933; SDFP-Growth Algorithm as a Novelty of Association Rule Mining Optimization; Boby Siswanto (https://orcid.org/0000-0002-1754-3867; Computer Science Department, BINUS Graduate Program-Doctor of Computer Science, Bina Nusantara University, South Jakarta, Indonesia); Haryono Soeparno (Computer Science Department, BINUS Graduate Program-Doctor of Computer Science, Bina Nusantara University, South Jakarta, Indonesia); Nesti Fronika Sianipar (Biotechnology Department, Faculty of Engineering, Bina Nusantara University, South Jakarta, Indonesia); Widodo Budiharto (https://orcid.org/0000-0003-2681-0901; Computer Science Department, School of Computer Science, Bina Nusantara University, South Jakarta, Indonesia); https://ieeexplore.ieee.org/document/10418933/; Association rule mining; SDFP-growth algorithm; dimensionality reduction; optimization; FP-tree pruning
title	SDFP-Growth Algorithm as a Novelty of Association Rule Mining Optimization
topic	Association rule mining; SDFP-growth algorithm; dimensionality reduction; optimization; FP-tree pruning
url	https://ieeexplore.ieee.org/document/10418933/
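
The description in this record gives two upper bounds on the number of frequent patterns: 2^n - 1 for n distinct items, and 2^|A| - 1 for the proposed method. The following minimal Python sketch only illustrates why a smaller item count shrinks that bound; it is not the SDFP-growth algorithm. It assumes that |A| is the number of distinct items left after dimensionality reduction, which the abstract does not state explicitly. The function names and the toy item sets are hypothetical.

```python
from itertools import combinations


def max_frequent_patterns(num_items: int) -> int:
    """Upper bound on frequent patterns: one per non-empty subset of the items."""
    return 2 ** num_items - 1


def enumerate_itemsets(items):
    """List every non-empty itemset, to check the bound by brute force."""
    ordered = sorted(items)
    return [
        combo
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]


# Hypothetical toy data, not taken from the paper.
all_items = {"bread", "milk", "eggs", "tea", "jam"}  # n = 5
reduced_items = {"bread", "milk", "eggs"}            # assumed |A| = 3

# The closed-form bound matches a brute-force count of the itemsets.
assert len(enumerate_itemsets(all_items)) == max_frequent_patterns(len(all_items))

print(max_frequent_patterns(len(all_items)))      # 31
print(max_frequent_patterns(len(reduced_items)))  # 7
```

Because the bound grows exponentially in the number of items, removing even a few items cuts the worst-case pattern count sharply: here two fewer items reduce it from 31 to 7.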
