Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach

Bibliographic Details
Main Authors: Iddi S. Mambina, Jema D. Ndibwile, Kisangiri F. Michael
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9849641/
collection DOAJ
description Due to the massive adoption of mobile money in Sub-Saharan countries, the global transaction value of mobile money exceeded $2 billion in 2021. Projections show transaction values will exceed $3 billion by the end of 2022, and Sub-Saharan Africa contributes half of the daily transactions. SMS (Short Message Service) phishing costs corporations and individuals millions of dollars annually. Spammers use Smishing (SMS phishing) messages to trick a mobile money user into sending electronic cash to an unintended mobile wallet. Although Smishing is a form of phishing, the two differ in the information available and in attack strategy, which makes Smishing difficult to detect. Numerous models and techniques for detecting Smishing attacks have been introduced for high-resource languages, yet few target low-resource languages such as Swahili. This study proposes a machine-learning-based model to classify Swahili Smishing text messages targeting mobile money users. Experimental results show that a hybrid model combining Extra Trees classifier feature selection and Random Forest, using TF-IDF (Term Frequency-Inverse Document Frequency) vectorization, performs best, with an accuracy of 99.86%. Results are measured against a baseline Multinomial Naïve Bayes model and compared with a set of other classic classifiers. The model returns the lowest false-positive and false-negative counts, 2 and 4 respectively, with a Log-Loss of 0.04. A Swahili dataset of 32,259 messages is used for performance evaluation.
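The description reports three kinds of evaluation figures: accuracy, false-positive and false-negative counts, and Log-Loss. As a rough illustration of how such figures are computed for a binary smishing classifier, here is a minimal Python sketch. It is not the authors' code. The toy labels and probabilities are hypothetical and do not come from the paper's dataset. The label encoding (1 = smishing, 0 = legitimate) and the 0.5 decision threshold are also assumptions.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as smishing and 0 as legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged as smishing
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # smishing that slipped through
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all messages that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def log_loss(y_true, p_smishing, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, p_smishing):
        p = min(max(p, eps), 1 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# Hypothetical toy data: true labels and predicted smishing probabilities.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
p_smishing = [0.9, 0.1, 0.4, 0.2, 0.7, 0.8, 0.05, 0.3]

# Assumed decision rule: flag a message when its probability is at least 0.5.
y_pred = [1 if p >= 0.5 else 0 for p in p_smishing]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
loss = log_loss(y_true, p_smishing)
print(f"FP={fp} FN={fn} accuracy={acc:.2f} log-loss={loss:.3f}")
```

Accuracy and the error counts depend only on the thresholded predictions, while Log-Loss also penalizes confident wrong probabilities. This is one reason a record like this might list both.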
id doaj.art-cab7e2d2ce9441ef983852e4b33e5024
institution Directory of Open Access Journals
issn 2169-3536
doi 10.1109/ACCESS.2022.3196464
volume 10
pages 83061-83074
author_affiliation Iddi S. Mambina (https://orcid.org/0000-0003-1144-147X): School of Computation and Communication Science and Engineering, The Nelson Mandela Institution of Science and Technology, Arusha, Tanzania
Jema D. Ndibwile (https://orcid.org/0000-0002-7962-2237): College of Engineering, Carnegie Mellon University Africa, Kigali, Rwanda
Kisangiri F. Michael: School of Computation and Communication Science and Engineering, The Nelson Mandela Institution of Science and Technology, Arusha, Tanzania
topic Natural language processing
mobile money
machine-learning
SMS
Sub-Saharan Africa
social engineering