Multi-Objective Evolutionary Instance Selection for Regression Tasks

Bibliographic Details
Main Authors: Mirosław Kordos, Krystian Łapa
Format: Article
Language: English
Published: MDPI AG 2018-09-01
Series: Entropy
Online Access: http://www.mdpi.com/1099-4300/20/10/746
Description
Summary: The purpose of instance selection is to reduce the size of a dataset while preserving as much of its useful information as possible and detecting and removing erroneous and redundant information. In this work, we analyze instance selection in regression tasks, applying the NSGA-II multi-objective evolutionary algorithm to direct the search for the optimal subset of the training dataset and the k-NN algorithm to evaluate the solutions during the selection process. A key advantage of the method is that it yields a pool of solutions situated on the Pareto front, each of which is the best for a particular balance between RMSE and compression. We discuss the parameters of the process and their influence on the results, and devote special effort to reducing the computational complexity of our approach. The experimental evaluation shows that the proposed method achieves good performance in terms of minimizing both prediction error and dataset size.
ISSN: 1099-4300
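
The two-objective idea in the summary (prediction error versus dataset size, with k-NN scoring each candidate subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the helper names (`knn_predict`, `evaluate`, `dominates`, `pareto_front`), and the value k=3 are all assumptions made for illustration, and randomly drawn selection masks stand in for the population that NSGA-II would actually evolve. Only the evaluation step and the Pareto-front filtering are shown.

```python
import math
import random


def knn_predict(train, query_x, k=3):
    """Predict y for query_x as the mean target of its k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query_x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def evaluate(mask, train, valid, k=3):
    """Return the two objectives to minimize: (RMSE on valid, fraction of instances kept)."""
    selected = [p for p, keep in zip(train, mask) if keep]
    if not selected:
        # An empty selection cannot predict anything; give it the worst
        # value in both objectives so it is always dominated.
        return (math.inf, 1.0)
    sq_err = [(knn_predict(selected, x, k) - y) ** 2 for x, y in valid]
    return (math.sqrt(sum(sq_err) / len(sq_err)), len(selected) / len(train))


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Indices of the non-dominated objective vectors."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]


rng = random.Random(0)

# Toy 1-D regression data, y = sin(x) plus noise (made up for illustration).
data = []
for _ in range(80):
    x = rng.uniform(0, 6)
    data.append((x, math.sin(x) + rng.gauss(0, 0.1)))
train, valid = data[:60], data[60:]

# Random masks at several retention rates stand in for an evolved population.
masks = [[rng.random() < p for _ in train]
         for p in (0.1, 0.2, 0.4, 0.6, 0.8) for _ in range(10)]
scores = [evaluate(m, train, valid) for m in masks]
front = pareto_front(scores)

# Each line is one trade-off point: no other mask is better in both objectives.
for i in sorted(front, key=lambda i: scores[i][1]):
    print(f"kept={scores[i][1]:.2f}  rmse={scores[i][0]:.3f}")
```

Each printed line corresponds to one candidate subset on the front, so a user could pick the point matching the compression level they can tolerate. A full evolutionary search would replace the random masks with selection, crossover, and mutation over the binary masks.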