Multi-Objective Evolutionary Rule-Based Classification with Categorical Data

The ease of interpretation of a classification model is essential for the task of validating it. Sometimes it is required to clearly explain the classification process of a model’s predictions. Models which are inherently easier to interpret can be effortlessly related to the context of th...


Bibliographic Details
Main Authors: Fernando Jiménez, Carlos Martínez, Luis Miralles-Pechuán, Gracia Sánchez, Guido Sciavicco
Format: Article
Language: English
Published: MDPI AG 2018-09-01
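Series: Entropy
Subjects: multi-objective evolutionary algorithms; rule-based classifiers; interpretable machine learning; categorical data
Online Access: http://www.mdpi.com/1099-4300/20/9/684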
_version_ 1811262856252882944
author Fernando Jiménez
Carlos Martínez
Luis Miralles-Pechuán
Gracia Sánchez
Guido Sciavicco
author_sort Fernando Jiménez
collection DOAJ
description The ease of interpretation of a classification model is essential for the task of validating it. Sometimes it is required to clearly explain the classification process of a model’s predictions. Models which are inherently easier to interpret can be effortlessly related to the context of the problem, and their predictions can be, if necessary, ethically and legally evaluated. In this paper, we propose a novel method to generate rule-based classifiers from categorical data that can be readily interpreted. Classifiers are generated using a multi-objective optimization approach focusing on two main objectives: maximizing the performance of the learned classifier and minimizing its number of rules. The multi-objective evolutionary algorithms ENORA and NSGA-II have been adapted to optimize the performance of the classifier based on three different machine learning metrics: accuracy, area under the ROC curve, and root mean square error. We have extensively compared the generated classifiers using our proposed method with classifiers generated using classical methods such as PART, JRip, OneR and ZeroR. The experiments have been conducted in full training mode, in 10-fold cross-validation mode, and in train/test splitting mode. To make results reproducible, we have used the well-known and publicly available datasets Breast Cancer, Monk’s Problem 2, Tic-Tac-Toe-Endgame, Car, kr-vs-kp and Nursery. After performing an exhaustive statistical test on our results, we conclude that the proposed method is able to generate highly accurate and easy to interpret classification models.
first_indexed 2024-04-12T19:33:18Z
format Article
id doaj.art-95a3de4225944e51816c3322f732fd9d
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-04-12T19:33:18Z
publishDate 2018-09-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-95a3de4225944e51816c3322f732fd9d
2022-12-22T03:19:16Z
eng
MDPI AG
Entropy
1099-4300
2018-09-01
20
9
684
10.3390/e20090684
e20090684
Multi-Objective Evolutionary Rule-Based Classification with Categorical Data
Fernando Jiménez (0)
Carlos Martínez (1)
Luis Miralles-Pechuán (2)
Gracia Sánchez (3)
Guido Sciavicco (4)
(0) Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain
(1) Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain
(2) Centre for Applied Data Analytics Research (CeADAR), University College Dublin, D04 Dublin 4, Ireland
(3) Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain
(4) Department of Mathematics and Computer Science, University of Ferrara, 44121 Ferrara, Italy
The ease of interpretation of a classification model is essential for the task of validating it. Sometimes it is required to clearly explain the classification process of a model’s predictions. Models which are inherently easier to interpret can be effortlessly related to the context of the problem, and their predictions can be, if necessary, ethically and legally evaluated. In this paper, we propose a novel method to generate rule-based classifiers from categorical data that can be readily interpreted. Classifiers are generated using a multi-objective optimization approach focusing on two main objectives: maximizing the performance of the learned classifier and minimizing its number of rules. The multi-objective evolutionary algorithms ENORA and NSGA-II have been adapted to optimize the performance of the classifier based on three different machine learning metrics: accuracy, area under the ROC curve, and root mean square error. We have extensively compared the generated classifiers using our proposed method with classifiers generated using classical methods such as PART, JRip, OneR and ZeroR. The experiments have been conducted in full training mode, in 10-fold cross-validation mode, and in train/test splitting mode. To make results reproducible, we have used the well-known and publicly available datasets Breast Cancer, Monk’s Problem 2, Tic-Tac-Toe-Endgame, Car, kr-vs-kp and Nursery. After performing an exhaustive statistical test on our results, we conclude that the proposed method is able to generate highly accurate and easy to interpret classification models.
http://www.mdpi.com/1099-4300/20/9/684
multi-objective evolutionary algorithms
rule-based classifiers
interpretable machine learning
categorical data
title Multi-Objective Evolutionary Rule-Based Classification with Categorical Data
topic multi-objective evolutionary algorithms
rule-based classifiers
interpretable machine learning
categorical data
url http://www.mdpi.com/1099-4300/20/9/684
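The description field names two objectives: maximizing classifier performance and minimizing the number of rules. As an illustration only, the sketch below shows the Pareto-dominance relation that multi-objective methods of this kind rely on, applied to a handful of made-up candidate classifiers. It does not implement ENORA or NSGA-II, and the `Candidate` type, the `dominates` and `pareto_front` helpers, and all numeric values are hypothetical and not taken from the article.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical rule-based classifier summarized by two objectives."""
    name: str
    error: float   # e.g. 1 - accuracy; lower is better
    n_rules: int   # number of rules; lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on at least one."""
    no_worse = a.error <= b.error and a.n_rules <= b.n_rules
    strictly_better = a.error < b.error or a.n_rules < b.n_rules
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up values, for illustration only.
candidates = [
    Candidate("A", 0.10, 12),
    Candidate("B", 0.12, 5),
    Candidate("C", 0.12, 8),  # same error as B but more rules
    Candidate("D", 0.25, 2),
    Candidate("E", 0.30, 2),  # same size as D but higher error
]

front = pareto_front(candidates)
print([c.name for c in front])
```

The surviving candidates form the trade-off curve between error and rule count, and a user would pick one according to how much accuracy they are willing to give up for a smaller, more readable rule set.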