Combining Data Envelopment Analysis and Machine Learning

Bibliographic Details
Main Authors: Nadia M. Guerrero, Juan Aparicio, Daniel Valero-Carreras
Format: Article
Language: English
Published: MDPI AG 2022-03-01
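Series: Mathematics
Subjects: data envelopment analysis; PAC learning; support vector regression; machine learning; structural risk minimization
Online Access: https://www.mdpi.com/2227-7390/10/6/909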
author Nadia M. Guerrero
Juan Aparicio
Daniel Valero-Carreras
collection DOAJ
description Data Envelopment Analysis (DEA) is one of the most widely used non-parametric techniques for assessing technical efficiency. DEA is concerned exclusively with minimizing the empirical error while satisfying certain shape constraints (convexity and free disposability). Unfortunately, by construction, DEA is a descriptive methodology that does not guard against overfitting. In this paper, we introduce a new methodology for estimating polyhedral technologies that follows the Structural Risk Minimization (SRM) principle. This technique is called Data Envelopment Analysis-based Machines (DEAM). Because the new method controls the generalization error of the model, the corresponding estimate of the technology does not suffer from overfitting. Moreover, the notion of ε-insensitivity is introduced, yielding a new and more robust definition of technical efficiency. Additionally, we show that DEAM can be seen as a machine learning-type extension of DEA that satisfies the same microeconomic postulates except for minimal extrapolation. Finally, the performance of DEAM is evaluated through simulations. We conclude that the frontier estimator derived from DEAM outperforms the one associated with DEA: the bias and mean squared error obtained for DEAM are smaller in all the scenarios analyzed, regardless of the number of variables and decision-making units (DMUs).
first_indexed 2024-03-09T13:26:47Z
format Article
id doaj.art-edf75629d4f4477094e09cda77e761d0
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-09T13:26:47Z
publishDate 2022-03-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-edf75629d4f4477094e09cda77e761d0; 2023-11-30T21:23:53Z; eng; MDPI AG; Mathematics; 2227-7390; 2022-03-01; vol. 10, no. 6, art. 909; 10.3390/math10060909; Combining Data Envelopment Analysis and Machine Learning; Nadia M. Guerrero, Juan Aparicio, Daniel Valero-Carreras (all: Center of Operations Research (CIO), Miguel Hernandez University of Elche (UMH), 03202 Elche, Spain); https://www.mdpi.com/2227-7390/10/6/909
title Combining Data Envelopment Analysis and Machine Learning
topic data envelopment analysis
PAC learning
support vector regression
machine learning
structural risk minimization
url https://www.mdpi.com/2227-7390/10/6/909
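
The abstract mentions ε-insensitivity and lists support vector regression among the record's subjects. As a rough illustration only, the sketch below shows the standard ε-insensitive loss used in support vector regression, where deviations of at most ε cost nothing. It is not the DEAM formulation from the paper, which this record does not spell out; the function names `eps_insensitive_loss` and `mean_eps_insensitive_loss` and the sample numbers are hypothetical.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard SVR-style loss (assumed illustration, not the paper's DEAM model).

    Deviations with |y_true - y_pred| <= eps are ignored; larger deviations
    are penalized linearly by the amount they exceed eps.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    return max(0.0, abs(y_true - y_pred) - eps)


def mean_eps_insensitive_loss(ys, preds, eps):
    """Average the eps-insensitive loss over paired observations."""
    if len(ys) != len(preds):
        raise ValueError("ys and preds must have the same length")
    if not ys:
        raise ValueError("at least one observation is required")
    total = sum(eps_insensitive_loss(y, p, eps) for y, p in zip(ys, preds))
    return total / len(ys)


if __name__ == "__main__":
    # Hypothetical observed outputs and predicted frontier values.
    observed = [3.0, 5.0]
    predicted = [2.5, 2.0]
    print(mean_eps_insensitive_loss(observed, predicted, eps=1.0))
```

With ε = 0 the loss reduces to the absolute error, so a larger ε makes the fit tolerant of small deviations, which is the sense in which ε-insensitivity is usually described as adding robustness.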