Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm
In recent years, with the wide usage of mobile devices, the problem of SMS spam has increased dramatically. Receiving these undesired messages continuously can frustrate users. Sometimes the messages can also be harmful, for example when they contain fake web pages intended to steal users’ confide...
Main Authors: | Diyari Jalal Mussa, Noor Ghazi M. Jameel |
---|---|
Format: | Article |
Language: | English |
Published: | Sulaimani Polytechnic University, 2019-11-01 |
Series: | Kurdistan Journal of Applied Research |
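Subjects: | SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost |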
Online Access: | https://kjar.spu.edu.iq/index.php/kjar/article/view/338 |
_version_ | 1797197647309701120 |
---|---|
author | Diyari Jalal Mussa; Noor Ghazi M. Jameel |
author_facet | Diyari Jalal Mussa; Noor Ghazi M. Jameel |
author_sort | Diyari Jalal Mussa |
collection | DOAJ |
description | In recent years, with the wide usage of mobile devices, the problem of SMS spam has increased dramatically. Receiving these undesired messages continuously can frustrate users. Sometimes the messages can also be harmful, for example when they contain fake web pages intended to steal users’ confidential information. Despite the many hazardous actions associated with spam, only a limited number of spam-filtering tools are available. In this paper, the XGBoost algorithm is used to address the SMS spam detection problem. A number of structural features were collected from previous studies, and 15 structural features were extracted from Tiago’s dataset, which is the dataset most frequently used by researchers. To select the most relevant features, two wrapper feature selection algorithms were used to reduce the feature set: sequential forward selection and sequential backward selection. The accuracy and performance obtained with the features chosen by sequential backward selection were better than those obtained with sequential forward selection. The nine optimal features extracted can be a good representation of a spam SMS message. The classification accuracy obtained by the proposed method, using the nine optimal features with the XGBoost algorithm, is 98.64% under 10-fold cross-validation. |
first_indexed | 2024-03-07T14:10:24Z |
format | Article |
id | doaj.art-1024a4e8eef546888610c19410f7817e |
institution | Directory of Open Access Journals |
issn | 2411-7684 2411-7706 |
language | English |
last_indexed | 2024-04-24T06:47:17Z |
publishDate | 2019-11-01 |
publisher | Sulaimani Polytechnic University |
record_format | Article |
series | Kurdistan Journal of Applied Research |
spelling | doaj.art-1024a4e8eef546888610c19410f7817e; 2024-04-22T17:19:20Z; eng; Sulaimani Polytechnic University; Kurdistan Journal of Applied Research; 2411-7684; 2411-7706; 2019-11-01; 4210.24017/science.2019.2.11338; Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm; Diyari Jalal Mussa [0]; Noor Ghazi M. Jameel [1]; [0] Information Technology Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq; [1] Computer Networks Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq; In recent years, with the wide usage of mobile devices, the problem of SMS spam has increased dramatically. Receiving these undesired messages continuously can frustrate users. Sometimes the messages can also be harmful, for example when they contain fake web pages intended to steal users’ confidential information. Despite the many hazardous actions associated with spam, only a limited number of spam-filtering tools are available. In this paper, the XGBoost algorithm is used to address the SMS spam detection problem. A number of structural features were collected from previous studies, and 15 structural features were extracted from Tiago’s dataset, which is the dataset most frequently used by researchers. To select the most relevant features, two wrapper feature selection algorithms were used to reduce the feature set: sequential forward selection and sequential backward selection. The accuracy and performance obtained with the features chosen by sequential backward selection were better than those obtained with sequential forward selection. The nine optimal features extracted can be a good representation of a spam SMS message. The classification accuracy obtained by the proposed method, using the nine optimal features with the XGBoost algorithm, is 98.64% under 10-fold cross-validation; https://kjar.spu.edu.iq/index.php/kjar/article/view/338; SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost. |
spellingShingle | Diyari Jalal Mussa Noor Ghazi M. Jameel Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm Kurdistan Journal of Applied Research SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost. |
title | Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm |
title_full | Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm |
title_fullStr | Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm |
title_full_unstemmed | Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm |
title_short | Relevant SMS Spam Feature Selection Using Wrapper Approach and XGBoost Algorithm |
title_sort | relevant sms spam feature selection using wrapper approach and xgboost algorithm |
topic | SMS spam, wrapper methods, sequential feature selection, sequential forward selection, sequential backward selection, boosting classifier, extreme gradient boosting, XGBoost. |
url | https://kjar.spu.edu.iq/index.php/kjar/article/view/338 |
work_keys_str_mv | AT diyarijalalmussa relevantsmsspamfeatureselectionusingwrapperapproachandxgboostalgorithm AT noorghazimjameel relevantsmsspamfeatureselectionusingwrapperapproachandxgboostalgorithm |
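
The abstract describes wrapper feature selection (sequential forward and sequential backward selection) scored by cross-validated classification accuracy, but the record gives no implementation details. The following is a minimal sketch of the two wrapper loops under stated assumptions, not the authors' method: a nearest-centroid classifier stands in for XGBoost so the example needs only the standard library, the data are synthetic (two informative columns and three noise columns) rather than Tiago's dataset, and the names `cv_accuracy`, `forward_select`, and `backward_select` are hypothetical.

```python
import random
from statistics import mean


def cv_accuracy(X, y, features, k=10):
    """k-fold cross-validated accuracy of a nearest-centroid classifier
    that sees only the listed feature columns (a stand-in for XGBoost)."""
    n = len(X)
    scores = []
    for fold in range(k):
        test_idx = list(range(fold, n, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        # One centroid per class, computed on the training folds only.
        centroids = {}
        for label in {y[i] for i in train_idx}:
            rows = [X[i] for i in train_idx if y[i] == label]
            centroids[label] = [mean(r[f] for r in rows) for f in features]
        correct = 0
        for i in test_idx:
            point = [X[i][f] for f in features]
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
            )
            correct += pred == y[i]
        scores.append(correct / len(test_idx))
    return mean(scores)


def forward_select(X, y, n_features, target, score=cv_accuracy):
    """Start empty; repeatedly add the feature that most improves the score."""
    selected, remaining = [], list(range(n_features))
    while len(selected) < target:
        best = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


def backward_select(X, y, n_features, target, score=cv_accuracy):
    """Start with all features; repeatedly drop the one whose removal
    leaves the best-scoring subset."""
    selected = list(range(n_features))
    while len(selected) > target:
        drop = max(selected, key=lambda f: score(X, y, [g for g in selected if g != f]))
        selected.remove(drop)
    return sorted(selected)


# Synthetic data (an assumption for illustration): columns 0 and 1 depend
# on the class label, columns 2 to 4 are pure noise.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    label = rng.randint(0, 1)
    informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
    X.append(informative + noise)
    y.append(label)

sfs = forward_select(X, y, n_features=5, target=2)
sbs = backward_select(X, y, n_features=5, target=2)
print("forward selection kept:", sfs, "accuracy:", round(cv_accuracy(X, y, sfs), 4))
print("backward selection kept:", sbs, "accuracy:", round(cv_accuracy(X, y, sbs), 4))
```

Because the scoring function is passed in as `score`, the same loops could be reused with a scorer that trains an XGBoost model on each candidate subset. That substitution is where a setup like the one in the abstract (15 candidate features reduced to nine) would differ from this sketch.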