Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art

In the past, several single classifiers, homogeneous ensembles, and heterogeneous ensembles have been proposed to detect the customers who are most likely to churn. Despite the popularity and accuracy of heterogeneous ensembles in various domains, they have not yet been picked up in customer churn prediction. Mo...

Bibliographic Details
Main Authors: Matthias Bogaert, Lex Delaere
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Mathematics
Subjects: churn prediction, ensemble methods, machine learning, data mining, CRM
Online Access: https://www.mdpi.com/2227-7390/11/5/1137
_version_ 1827752522878550016
author Matthias Bogaert
Lex Delaere
author_facet Matthias Bogaert
Lex Delaere
author_sort Matthias Bogaert
collection DOAJ
description In the past, several single classifiers, homogeneous ensembles, and heterogeneous ensembles have been proposed to detect the customers who are most likely to churn. Despite the popularity and accuracy of heterogeneous ensembles in various domains, they have not yet been picked up in customer churn prediction. Moreover, there are other developments in performance evaluation and model comparison that have not been introduced in a systematic way. Therefore, the aim of this study is to perform a large-scale benchmark study in customer churn prediction that implements these novel methods. To do so, we benchmark 33 classifiers, including 6 single classifiers, 14 homogeneous ensembles, and 13 heterogeneous ensembles, across 11 datasets. Our findings indicate that heterogeneous ensembles are consistently ranked higher than homogeneous ensembles and single classifiers. A heterogeneous ensemble with simulated annealing classifier selection is ranked highest in terms of AUC and expected maximum profit. For accuracy, F1 measure, and top-decile lift, a heterogeneous ensemble optimized by non-negative binomial likelihood and a stacked heterogeneous ensemble are, respectively, the top-ranked classifiers. Our study contributes to the literature by being the first to include such an extensive set of classifiers, performance metrics, and statistical tests in a benchmark study of customer churn.
first_indexed 2024-03-11T07:18:32Z
format Article
id doaj.art-6623e9203578403aa1ea1dbef9399bc4
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-11T07:18:32Z
publishDate 2023-02-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-6623e9203578403aa1ea1dbef9399bc4; 2023-11-17T08:08:43Z; eng; MDPI AG; Mathematics; 2227-7390; 2023-02-01; vol. 11, no. 5, art. 1137; 10.3390/math11051137; Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art; Matthias Bogaert (0), Lex Delaere (1); (0, 1) Department of Marketing, Innovation and Organization, Ghent University, 9000 Ghent, Belgium; https://www.mdpi.com/2227-7390/11/5/1137; churn prediction; ensemble methods; machine learning; data mining; CRM
title Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art
title_sort ensemble methods in customer churn prediction a comparative analysis of the state of the art
topic churn prediction
ensemble methods
machine learning
data mining
CRM
url https://www.mdpi.com/2227-7390/11/5/1137
work_keys_str_mv AT matthiasbogaert ensemblemethodsincustomerchurnpredictionacomparativeanalysisofthestateoftheart
AT lexdelaere ensemblemethodsincustomerchurnpredictionacomparativeanalysisofthestateoftheart
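
Two of the evaluation ideas named in the description, top-decile lift and ranking classifiers across datasets, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `top_decile_lift` and `average_ranks` are hypothetical, the tie handling (tied classifiers share the mean rank) is an assumption, and all numbers in the usage lines are invented for illustration and are not results from the study.

```python
from typing import Dict, List, Sequence


def top_decile_lift(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Churn rate among the 10% highest-scored customers, divided by the overall churn rate."""
    n = len(y_true)
    if n == 0 or n != len(scores):
        raise ValueError("y_true and scores must be non-empty and of equal length")
    overall_rate = sum(y_true) / n
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no churners")
    k = max(1, n // 10)  # size of the top decile (at least one customer)
    # Indices of customers sorted from highest to lowest predicted churn score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top_rate = sum(y_true[i] for i in order[:k]) / k
    return top_rate / overall_rate


def average_ranks(results: Dict[str, List[float]]) -> Dict[str, float]:
    """Average rank per classifier across datasets (rank 1 = best score).

    `results` maps a classifier name to one score per dataset, higher is better.
    Tied classifiers share the mean of the ranks they occupy (an assumption).
    """
    names = list(results)
    n_datasets = len(results[names[0]])
    totals = {name: 0.0 for name in names}
    for d in range(n_datasets):
        column = [results[name][d] for name in names]
        for name in names:
            value = results[name][d]
            better = sum(1 for v in column if v > value)
            tied = sum(1 for v in column if v == value)  # includes itself
            totals[name] += better + (tied + 1) / 2
    return {name: totals[name] / n_datasets for name in names}


# Illustrative data only: 20 customers, 4 churners, scores already descending.
labels = [1, 1, 0, 1, 0, 0, 0, 1] + [0] * 12
scores = [1.0 - 0.05 * i for i in range(20)]
lift = top_decile_lift(labels, scores)

# Illustrative AUC-like scores on two made-up datasets.
ranks = average_ranks({
    "single": [0.70, 0.72],
    "homogeneous": [0.75, 0.74],
    "heterogeneous": [0.78, 0.74],
})
```

A lift above 1 means the top decile contains proportionally more churners than the customer base as a whole. Average ranks of this kind are the usual input to rank-based comparisons of many classifiers over many datasets; which statistical tests the study applies is not stated in this record.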