Signature-based Tree for Finding Frequent Itemsets

Bibliographic Details
Main Authors:	Mohamed El Hadi Benelhadj, Mohamed Mahmoud Deye, Yahya Slimani
Format:	Article
Language:	English
Published:	Croatian Communications and Information Society (CCIS) 2023-03-01
Series:	Journal of Communications Software and Systems
Subjects:	data mining; data compression; data storage; tree structure; signature
Online Access:	https://jcoms.fesb.unist.hr/10.24138/jcomss-2022-0065/

Description
Summary:	The efficiency of a data mining process depends on the data structure used to find frequent itemsets. Two approaches are possible: use the original transaction dataset, or transform it into another, more compact structure. Many algorithms use a tree as the compact structure, such as the FP-Tree and its associated algorithm, FP-Growth. Although this structure reduces the number of dataset scans to only two, its efficiency depends on two criteria: (i) the size of the support (small or large); (ii) the type of transaction dataset (sparse or dense). Depending on these two criteria, the generated tree can become very large. In this paper, we propose a new tree-based structure that focuses on transactions rather than on itemsets. Hence, we avoid the problem of support values that have a negative impact on the generated tree.
ISSN:	1845-6421, 1846-6079
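
The two-scan FP-Tree construction mentioned in the summary, and the way the tree size depends on the support, can be illustrated with a minimal Python sketch. This is the classic FP-Tree idea only, not the signature-based structure proposed in the article. The names `Node`, `build_fp_tree`, and `tree_size` and the toy transactions are hypothetical, chosen for illustration, and the support is assumed to be an absolute count.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and child nodes keyed by item."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two scans; min_support is an absolute count (assumption)."""
    # Scan 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in counts.items() if c >= min_support}

    # Scan 2: insert each transaction, keeping only frequent items,
    # ordered by descending count (ties broken by item name).
    root = Node()
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-counts[i], i))
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


def tree_size(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + tree_size(child) for child in node.children.values())


# Toy dataset, invented for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "d"],
    ["b", "c", "e"],
    ["a", "b", "c", "d"],
]

for min_support in (1, 2, 3):
    print(min_support, tree_size(build_fp_tree(transactions, min_support)))
```

On this toy dataset the tree shrinks from 8 to 7 to 5 nodes as the minimum support rises from 1 to 3, which shows on a small scale why a low support can produce a large tree.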

Similar Items