A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters

Bibliographic Details
Main Authors: A. Omondi, I.A. Lukandu, G.W. Wanyembi
Format: Article
Language: English
Published: Shahrood University of Technology 2020-11-01
Series: Journal of Artificial Intelligence and Data Mining
Online Access: http://jad.shahroodut.ac.ir/article_1832_9bee8989299c0484c9ba5bba0735cb17.pdf
Description
Summary: Redundant and irrelevant features in high-dimensional data increase the complexity of the underlying mathematical models. Pre-processing steps that search for the most relevant features are therefore necessary to reduce the dimensionality of the data. This study used a meta-heuristic search approach that relies on lightweight random simulations to balance the exploitation of relevant features against the exploration of features that have the potential to be relevant. In doing so, the study evaluated how effectively manipulating the search component of feature selection achieves high accuracy with reduced dimensions. A control group experimental design was used to gather empirical evidence. The context of the experiment was the high-dimensional data encountered in performance tuning of complex database systems. The Wilcoxon signed-rank test at the 0.05 level of significance was used to compare repeated classification accuracy measurements on the independent experiment and control group samples. The results, with a p-value < 0.05, provided evidence to reject the null hypothesis in favour of the alternative hypothesis, which states that meta-heuristic search approaches are effective in achieving high accuracy with reduced dimensions, depending on the outcome variable under investigation.
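The summary describes the search only in general terms, so the following is a minimal illustrative sketch of the idea rather than the authors' algorithm. It assumes a UCB1 rule to trade off exploitation and exploration, uniformly random rollouts to complete each candidate subset, synthetic data, and a toy nearest-centroid scorer. All function names, parameters, and default values here are hypothetical.

```python
import math
import random


def make_data(n_rows=200, n_features=8, relevant=(0, 3), seed=0):
    """Synthetic stand-in data: the label depends only on the `relevant` columns."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    y = [1 if sum(row[j] for j in relevant) > 0 else 0 for row in X]
    return X, y


def subset_accuracy(X, y, subset):
    """Hold-out accuracy of a nearest-centroid classifier that sees only `subset`."""
    half = len(X) // 2  # first half trains the centroids, second half is held out
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == label]
        if not rows:  # degenerate split: no training rows for this class
            return 0.0
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    correct = 0
    for i in range(half, len(X)):
        point = [X[i][j] for j in subset]
        dist = {
            label: sum((p - c) ** 2 for p, c in zip(point, centre))
            for label, centre in centroids.items()
        }
        correct += int(min(dist, key=dist.get) == y[i])
    return correct / (len(X) - half)


def monte_carlo_select(X, y, k=2, n_simulations=400, c=0.5, seed=1):
    """Pick k features using UCB1-guided random simulations (assumed scheme)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    visits = [0] * n_features
    total_reward = [0.0] * n_features

    def ucb(j, t):
        if visits[j] == 0:
            return math.inf  # force every feature to be tried at least once
        mean = total_reward[j] / visits[j]
        return mean + c * math.sqrt(math.log(t) / visits[j])

    for t in range(1, n_simulations + 1):
        # Exploitation vs. exploration: choose one feature by its UCB1 score.
        anchor = max(range(n_features), key=lambda j: ucb(j, t))
        # Lightweight random simulation: fill the remaining slots at random.
        others = [j for j in range(n_features) if j != anchor]
        subset = [anchor] + rng.sample(others, k - 1)
        # Back up the simulated accuracy to the feature chosen by UCB1.
        visits[anchor] += 1
        total_reward[anchor] += subset_accuracy(X, y, subset)

    means = [total_reward[j] / visits[j] if visits[j] else 0.0
             for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: means[j], reverse=True)[:k]


X, y = make_data()
selected = monte_carlo_select(X, y)
print(sorted(selected))
```

In this sketch the constant `c` controls how strongly rarely tried features are favoured: larger values spend more simulations on exploration, smaller values concentrate on features that have already scored well.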
ISSN: 2322-5211, 2322-4444