Data mining reduction methods and performances of rules

In data mining the accuracy of models are associated with the strength of the rules.However, most machine learning techniques produce a large number of rules.The consequence is with large number of rules generated,processing time is much longer. This study examines rules of different lengths of attr...

Full description

Bibliographic Details
Main Authors: Ahmad, Faudziah, Basir, Mohammad Aizat
Format: Conference or Workshop Item
Language:English
Published: 2009
Subjects:
Online Access:https://repo.uum.edu.my/id/eprint/13596/1/PID263.pdf
Description
Summary:In data mining the accuracy of models are associated with the strength of the rules.However, most machine learning techniques produce a large number of rules.The consequence is with large number of rules generated,processing time is much longer. This study examines rules of different lengths of attributes in terms of performance based on percentage of accuracy. The research adopts the Knowledge Discovery in Databases “KDD” methodology for analysis and applies various data mining techniques in the experiments.Data of 50 hardware dataset companies which, contains 31 attributes and 400 records have been used. In summary, results show that in terms of performance of rules, Genetic Algorithm has produced the highest number of rules followed by Johnson’s Algorithm and Holte’s 1R.The best classifier for extracting rules in this study is VOT (Voting of Object Tracking).In terms of performance of rules, best results comes from rules with 30 attributes, followed by rules with 1 intersection attribute and lastly rules with 3 intersection attributes. Among the three sets of attributes, the 3 intersection attributes are considered as the attributes that can be used as predictor attributes.