Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm

In recent years, with the widespread use of mobile devices, the problem of SMS spam has increased dramatically. Continuously receiving these unwanted messages can frustrate users. Sometimes the messages are also harmful, carrying links to fake web pages in order to steal users’ confide...

Bibliographic Details
Main Authors: Diyari Jalal Mussa, Noor Ghazi M. Jameel
Format: Article
Language: English
Published: Sulaimani Polytechnic University 2019-11-01
Series: Kurdistan Journal of Applied Research
Subjects:
Online Access: https://kjar.spu.edu.iq/index.php/kjar/article/view/338
author Diyari Jalal Mussa
Noor Ghazi M. Jameel
collection DOAJ
description In recent years, with the widespread use of mobile devices, the problem of SMS spam has increased dramatically. Continuously receiving these unwanted messages can frustrate users, and some messages are harmful: they carry links to fake web pages designed to steal users’ confidential information. Despite the many hazards of spam, only a limited number of spam-filtering tools are available. This paper uses the XGBoost algorithm to address the SMS spam detection problem. A number of structural features were collected from previous studies, and 15 structural features were extracted from Tiago’s dataset, the dataset most frequently used by researchers. To select the most relevant features, two wrapper feature selection algorithms were applied: sequential forward selection and sequential backward selection. The features selected by sequential backward selection yielded better accuracy and performance than those selected by sequential forward selection. The nine optimal features extracted can serve as a good representation of a spam SMS message. With these nine features, the proposed XGBoost-based method achieves a classification accuracy of 98.64% under 10-fold cross-validation.
first_indexed 2024-03-07T14:10:24Z
format Article
id doaj.art-1024a4e8eef546888610c19410f7817e
institution Directory Open Access Journal
issn 2411-7684
2411-7706
language English
last_indexed 2024-04-24T06:47:17Z
publishDate 2019-11-01
publisher Sulaimani Polytechnic University
record_format Article
series Kurdistan Journal of Applied Research
spelling doaj.art-1024a4e8eef546888610c19410f7817e
2024-04-22T17:19:20Z
eng
Sulaimani Polytechnic University
Kurdistan Journal of Applied Research
2411-7684
2411-7706
2019-11-01
4
2
10.24017/science.2019.2.11
338
Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm
Diyari Jalal Mussa (Information Technology Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq)
Noor Ghazi M. Jameel (Computer Networks Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq)
https://kjar.spu.edu.iq/index.php/kjar/article/view/338
SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost.
title Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm
topic SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost.
url https://kjar.spu.edu.iq/index.php/kjar/article/view/338