Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach
Due to the massive adoption of mobile money in Sub-Saharan countries, the global transaction value of mobile money exceeded $2 billion in 2021. Projections show transaction values will exce...
Main Authors: | Iddi S. Mambina, Jema D. Ndibwile, Kisangiri F. Michael |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
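Subjects: | Natural language processing, mobile money, machine-learning, SMS, Sub-Saharan Africa, social engineering |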
Online Access: | https://ieeexplore.ieee.org/document/9849641/ |
_version_ | 1828148918348677120 |
---|---|
author | Iddi S. Mambina, Jema D. Ndibwile, Kisangiri F. Michael |
author_facet | Iddi S. Mambina, Jema D. Ndibwile, Kisangiri F. Michael |
author_sort | Iddi S. Mambina |
collection | DOAJ |
description | Due to the massive adoption of mobile money in Sub-Saharan countries, the global transaction value of mobile money exceeded $2 billion in 2021. Projections show transaction values will exceed $3 billion by the end of 2022, and Sub-Saharan Africa contributes half of the daily transactions. SMS (Short Message Service) phishing costs corporations and individuals millions of dollars annually. Spammers use Smishing (SMS Phishing) messages to trick a mobile money user into sending electronic cash to an unintended mobile wallet. Though Smishing is an incarnation of phishing, they differ in the information available and attack strategy. As a result, detecting Smishing becomes difficult. Numerous models and techniques to detect Smishing attacks have been introduced for high-resource languages, yet few target low-resource languages such as Swahili. This study proposes a machine-learning-based model to classify Swahili Smishing text messages targeting mobile money users. Experimental results show a hybrid model of Extratree classifier feature selection and Random Forest using TFIDF (Term Frequency Inverse Document Frequency) vectorization yields the best model with an accuracy score of 99.86%. Results are measured against a baseline Multinomial Naïve-Bayes model. In addition, comparison with a set of other classic classifiers is also done. The model returns the lowest false positive and false negative of 2 and 4, respectively, with a Log-Loss of 0.04. A Swahili dataset with 32259 messages is used for performance evaluation. |
first_indexed | 2024-04-11T21:22:34Z |
format | Article |
id | doaj.art-cab7e2d2ce9441ef983852e4b33e5024 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T21:22:34Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-cab7e2d2ce9441ef983852e4b33e5024; 2022-12-22T04:02:34Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 83061; 83074; 10.1109/ACCESS.2022.3196464; 9849641; Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach; Iddi S. Mambina (0) https://orcid.org/0000-0003-1144-147X; Jema D. Ndibwile (1) https://orcid.org/0000-0002-7962-2237; Kisangiri F. Michael (2); (0) School of Computation and Communication Science and Engineering, The Nelson Mandela Institution of Science and Technology, Arusha, Tanzania; (1) College of Engineering, Carnegie Mellon University Africa, Kigali, Rwanda; (2) School of Computation and Communication Science and Engineering, The Nelson Mandela Institution of Science and Technology, Arusha, Tanzania; Due to the massive adoption of mobile money in Sub-Saharan countries, the global transaction value of mobile money exceeded $2 billion in 2021. Projections show transaction values will exceed $3 billion by the end of 2022, and Sub-Saharan Africa contributes half of the daily transactions. SMS (Short Message Service) phishing cost corporations and individuals millions of dollars annually. Spammers use Smishing (SMS Phishing) messages to trick a mobile money user into sending electronic cash to an unintended mobile wallet. Though Smishing is an incarnation of phishing, they differ in the information available and attack strategy. As a result, detecting Smishing becomes difficult. Numerous models and techniques to detect Smishing attacks have been introduced for high-resource languages, yet few target low-resource languages such as Swahili. This study proposes a machine-learning based model to classify Swahili Smishing text messages targeting mobile money users. Experimental results show a hybrid model of Extratree classifier feature selection and Random Forest using TFIDF (Term Frequency Inverse Document Frequency) vectorization yields the best model with an accuracy score of 99.86%. Results are measured against a baseline Multinomial Naïve-Bayes model. In addition, comparison with a set of other classic classifiers is also done. The model returns the lowest false positive and false negative of 2 and 4, respectively, with a Log-Loss of 0.04. A Swahili dataset with 32259 messages is used for performance evaluation.; https://ieeexplore.ieee.org/document/9849641/; Natural language processing; mobile money; machine-learning; SMS; Sub-Saharan Africa; social engineering |
spellingShingle | Iddi S. Mambina; Jema D. Ndibwile; Kisangiri F. Michael; Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach; IEEE Access; Natural language processing; mobile money; machine-learning; SMS; Sub-Saharan Africa; social engineering |
title | Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach |
title_full | Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach |
title_fullStr | Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach |
title_full_unstemmed | Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach |
title_short | Classifying Swahili Smishing Attacks for Mobile Money Users: A Machine-Learning Approach |
title_sort | classifying swahili smishing attacks for mobile money users a machine learning approach |
topic | Natural language processing; mobile money; machine-learning; SMS; Sub-Saharan Africa; social engineering |
url | https://ieeexplore.ieee.org/document/9849641/ |
work_keys_str_mv | AT iddismambina classifyingswahilismishingattacksformobilemoneyusersamachinelearningapproach AT jemadndibwile classifyingswahilismishingattacksformobilemoneyusersamachinelearningapproach AT kisangirifmichael classifyingswahilismishingattacksformobilemoneyusersamachinelearningapproach |
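
The description reports accuracy, false-positive and false-negative counts, and Log-Loss for a binary Smishing classifier. As a reading aid, the minimal sketch below shows how such metrics are commonly computed from true labels and predicted probabilities. It is not the authors' code: the function names, the 0.5 decision threshold, and the six-message toy data are assumptions made purely for illustration, and the label convention (1 = Smishing, 0 = legitimate) is likewise assumed.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels; 1 = smishing, 0 = legitimate (assumed convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate message flagged as smishing
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # smishing message missed
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of messages whose predicted label matches the true label."""
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def log_loss(y_true, p_smishing, eps=1e-15):
    """Mean binary cross-entropy; p_smishing[i] is the predicted probability of class 1."""
    total = 0.0
    for t, p in zip(y_true, p_smishing):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# Toy data invented for illustration only; not taken from the paper's dataset.
y_true = [1, 0, 1, 0, 0, 1]
p_smishing = [0.9, 0.2, 0.4, 0.1, 0.7, 0.8]
# Hypothetical 0.5 threshold turns probabilities into hard labels.
y_pred = [1 if p >= 0.5 else 0 for p in p_smishing]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("FP:", fp, "FN:", fn)
print("accuracy:", round(accuracy(y_true, y_pred), 4))
print("log-loss:", round(log_loss(y_true, p_smishing), 4))
```

Log-Loss penalizes confident wrong probabilities, so it can separate two models that have the same accuracy and the same error counts.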