Poverty prediction using E-commerce dataset and filter-based feature selection approach

Bibliographic Details
Main Authors: Dedy Rahman Wijaya, Raden Ilham Fadhilah Ibadurrohman, Elis Hernawati, Wawa Wikusna
Format: Article
Language: English
Published: Nature Portfolio 2024-02-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-52752-7
_version_ 1797275190849175552
author Dedy Rahman Wijaya
Raden Ilham Fadhilah Ibadurrohman
Elis Hernawati
Wawa Wikusna
collection DOAJ
description Abstract Poverty is a problem in many countries, notably Indonesia. Poverty information is commonly obtained through surveys and censuses, but these take a long time and require substantial human resources. Governments and policymakers, however, need a faster way to assess socio-economic conditions for area development planning. In this paper, we therefore use e-commerce data and machine learning algorithms as a proxy for poverty levels that can provide information faster than surveys or censuses. Because the e-commerce dataset is high-dimensional, feature selection algorithms are employed to determine the best features before a machine learning model is built. Three machine learning algorithms (support vector regression, linear regression, and k-nearest neighbor) are then compared for predicting the poverty rate. The contribution of this paper is thus a combination of statistical-based feature selection and machine learning algorithms for predicting the poverty rate from e-commerce data. According to the experimental results, the combination of f-score feature selection and support vector regression outperforms the other methods, which shows that e-commerce data and machine learning algorithms can potentially be used as a proxy for predicting poverty.
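The filter-then-predict approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "f-score" means the univariate F-statistic derived from the Pearson correlation between each feature and the target (the quantity that scikit-learn's `f_regression` computes), it uses synthetic stand-in data rather than the e-commerce dataset, and it uses a simple k-nearest-neighbor regressor (one of the three algorithms compared) in place of support vector regression to stay dependency-free. The helper names `f_scores`, `select_top_k`, and `knn_predict` are hypothetical.

```python
import math
import random


def f_scores(X, y):
    """Univariate F-statistic of each feature column against the target y."""
    n = len(y)
    y_mean = sum(y) / n
    syy = sum((v - y_mean) ** 2 for v in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        sxx = sum((v - c_mean) ** 2 for v in col)
        sxy = sum((a - c_mean) * (b - y_mean) for a, b in zip(col, y))
        if sxx == 0 or syy == 0:
            scores.append(0.0)  # a constant column carries no signal
            continue
        # Squared Pearson correlation, capped to avoid division by zero.
        r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)
        scores.append(r2 / (1.0 - r2) * (n - 2))
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def knn_predict(X_train, y_train, x, k=3):
    """Predict by averaging the targets of the k nearest training rows."""
    nearest = sorted(range(len(X_train)),
                     key=lambda i: math.dist(X_train[i], x))[:k]
    return sum(y_train[i] for i in nearest) / k


# Synthetic stand-in data (hypothetical; not the paper's e-commerce dataset).
rng = random.Random(0)
n_rows, n_features = 200, 8
X = [[rng.uniform(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
# The target depends only on features 1 and 4, plus a little noise.
y = [3.0 * row[1] - 2.0 * row[4] + rng.gauss(0, 0.05) for row in X]

scores = f_scores(X, y)
selected = select_top_k(scores, k=2)
X_selected = [[row[j] for j in selected] for row in X]

# Hold out the last row and predict it from the remaining rows.
prediction = knn_predict(X_selected[:-1], y[:-1], X_selected[-1], k=3)
print(sorted(selected), round(prediction, 3), round(y[-1], 3))
```

Because the filter step scores each feature independently of the downstream model, the same selected columns could be passed to any of the three regressors named in the abstract.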
first_indexed 2024-03-07T15:09:48Z
format Article
id doaj.art-150d4fb725ae4a308b2cb4e5b51da375
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-03-07T15:09:48Z
publishDate 2024-02-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-150d4fb725ae4a308b2cb4e5b51da375 2024-03-05T18:40:25Z eng Nature Portfolio Scientific Reports 2045-2322 2024-02-01 141113 10.1038/s41598-024-52752-7 Poverty prediction using E-commerce dataset and filter-based feature selection approach Dedy Rahman Wijaya (School of Applied Science, Telkom University), Raden Ilham Fadhilah Ibadurrohman (School of Applied Science, Telkom University), Elis Hernawati (School of Applied Science, Telkom University), Wawa Wikusna (School of Applied Science, Telkom University) https://doi.org/10.1038/s41598-024-52752-7
title Poverty prediction using E-commerce dataset and filter-based feature selection approach
url https://doi.org/10.1038/s41598-024-52752-7