Improved Genetic Algorithm for High-Utility Itemset Mining

Bibliographic Details
Main Authors: Qiang Zhang, Wei Fang, Jun Sun, Quan Wang
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8926353/
Description
Summary: High-utility itemset mining (HUIM) is an important research topic in data mining. Traditional HUIM algorithms face a search space that grows exponentially when the database or the number of distinct items is very large. As an effective alternative, evolutionary computation (EC)-based algorithms have been proposed for HUIM because they can obtain a set of nearly optimal solutions in limited time. However, finding the complete set of high-utility itemsets (HUIs) in a transactional database is still time-consuming for EC-based algorithms. To address this problem, we propose an HUIM algorithm based on an improved genetic algorithm (HUIM-IGA). A neighborhood exploration strategy improves the efficiency of the search for HUIs, and a population diversity maintenance strategy reduces the number of missed HUIs. An individual repair method reduces invalid combinations when discovering HUIs, and an elite strategy prevents the loss of HUIs. Experimental results on a set of real-world datasets demonstrate that the proposed algorithm can find all HUIs for a given minimum utility threshold, and that HUIM-IGA requires less time than state-of-the-art EC-based HUIM algorithms to mine the same number of HUIs.
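For readers unfamiliar with the problem the summary refers to, the following is a minimal sketch of the standard HUIM problem definition. It is not HUIM-IGA and does not reproduce any part of the authors' method. The toy database, the unit profits, and the function names itemset_utility and mine_huis_bruteforce are assumptions made up for illustration. The brute-force enumeration visits every non-empty itemset, which is the exponential search space that motivates heuristic approaches.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profit (external utility) of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def mine_huis_bruteforce(transactions, unit_profit, min_util):
    """Enumerate all 2^n - 1 non-empty itemsets; only feasible for very small n."""
    items = sorted(unit_profit)
    huis = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_util:
                huis[itemset] = utility
    return huis


if __name__ == "__main__":
    for itemset, utility in mine_huis_bruteforce(TRANSACTIONS, UNIT_PROFIT, 16).items():
        print(itemset, utility)
```

In this toy data the itemset (a, b) reaches the threshold of 16 although its subset (b) alone does not, which shows that utility is not anti-monotone and that the frequency-based pruning used in frequent itemset mining cannot be applied directly.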
ISSN:2169-3536