A Sampling-Based Stack Framework for Imbalanced Learning in Churn Prediction
Churn prediction is gaining popularity in the research community as a powerful paradigm that supports data-driven operational decisions. Datasets related to churn prediction are often skewed with imbalanced class distribution. Data-level solutions, like over-sampling and under-sampling, have been co...
Main Authors: | Soumi De, P. Prabu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
Subjects: | Churn prediction; ensemble classifiers; over-sampling; under-sampling; ensemble stack |
Online Access: | https://ieeexplore.ieee.org/document/9803037/ |
_version_ | 1811342167925325824 |
---|---|
author | Soumi De; P. Prabu |
author_facet | Soumi De; P. Prabu |
author_sort | Soumi De |
collection | DOAJ |
description | Churn prediction is gaining popularity in the research community as a powerful paradigm that supports data-driven operational decisions. Datasets related to churn prediction are often skewed, with an imbalanced class distribution. Data-level solutions, like over-sampling and under-sampling, have been commonly used by researchers to address this problem. There are a limited number of case studies that attempt to evolve these data-level solutions by integrating them with computationally advanced frameworks, like ensembles. Ensembles primarily employ algorithmic diversity using a fixed set of training instances to achieve superior performance. This study aims to introduce algorithmic diversity in ensembles by modifying the fixed set of training instances using diverse sampling strategies to increase predictive performance in imbalanced learning. Data is acquired from the world’s largest open hotel commerce platform company. A four-part series of experiments is conducted to analyze the effectiveness of sampling techniques and ensemble solutions on model performance. A new sampling-based stack framework called “Stacking of Samplers for Imbalanced Learning” is proposed. The framework combines the prediction capabilities of sampling solutions to stimulate the information gain of the meta features in the ensemble. It is observed that the proposed framework leads to an improvement in model performance, with an AUC of 86.4% and a top-decile lift of 4.7 for customers of the hotel technology provider. Additionally, results show that the framework records a higher information gain for meta features used in a stack, compared to commonly used stack frameworks. |
first_indexed | 2024-04-13T19:06:42Z |
format | Article |
id | doaj.art-361285820c4f4e328aafa4bdbd83a948 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-13T19:06:42Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-361285820c4f4e328aafa4bdbd83a948; 2022-12-22T02:33:57Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; vol. 10, pp. 68017-68028; doi:10.1109/ACCESS.2022.3185227; 9803037; A Sampling-Based Stack Framework for Imbalanced Learning in Churn Prediction; Soumi De (https://orcid.org/0000-0002-4606-796X), Department of Data Science, CHRIST (Deemed to be University), Bengaluru, India; P. Prabu, Department of Computer Science, CHRIST (Deemed to be University), Bengaluru, India; https://ieeexplore.ieee.org/document/9803037/; Churn prediction; ensemble classifiers; over-sampling; under-sampling; ensemble stack |
spellingShingle | Soumi De; P. Prabu; A Sampling-Based Stack Framework for Imbalanced Learning in Churn Prediction; IEEE Access; Churn prediction; ensemble classifiers; over-sampling; under-sampling; ensemble stack |
title | A Sampling-Based Stack Framework for Imbalanced Learning in Churn Prediction |
title_sort | sampling based stack framework for imbalanced learning in churn prediction |
topic | Churn prediction; ensemble classifiers; over-sampling; under-sampling; ensemble stack |
url | https://ieeexplore.ieee.org/document/9803037/ |
work_keys_str_mv | AT soumide asamplingbasedstackframeworkforimbalancedlearninginchurnprediction AT pprabu asamplingbasedstackframeworkforimbalancedlearninginchurnprediction AT soumide samplingbasedstackframeworkforimbalancedlearninginchurnprediction AT pprabu samplingbasedstackframeworkforimbalancedlearninginchurnprediction |
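
The description above reports a top-decile lift of 4.7 without defining the metric. As a reading aid, the sketch below uses the common definition: the churn rate among the 10% of customers with the highest predicted scores, divided by the overall churn rate. The function name `top_decile_lift` and the toy data are illustrative assumptions and are not taken from the article, which may compute the metric differently.

```python
def top_decile_lift(y_true, scores):
    """Churn rate in the top 10% of scores divided by the overall churn rate.

    Assumes y_true holds 0/1 labels (1 = churned) and scores holds the
    model's predicted churn scores, in the same order.
    """
    if len(y_true) != len(scores) or len(y_true) == 0:
        raise ValueError("y_true and scores must be non-empty and equal in length")

    overall_rate = sum(y_true) / len(y_true)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no churners")

    # Rank customers from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    # Keep the top 10% (at least one customer).
    top_n = max(1, len(ranked) // 10)
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n

    return top_rate / overall_rate


# Toy data (hypothetical): 20 customers, 4 churners, scores strictly decreasing.
labels = [1, 1, 1, 0, 1] + [0] * 15
scores = [1 - i * 0.05 for i in range(20)]
lift = top_decile_lift(labels, scores)
```

In this toy case both customers in the top decile are churners while the overall churn rate is 20%, so the lift is 5. Under this definition, a lift of 4.7 would mean the top-scored decile contains churners at 4.7 times the base rate.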