Hybrid Approach for Phishing Website Detection Using Classification Algorithms

The internet has significantly altered how we work and interact with one another. Statistics show that 63.1% of the present world population are internet users, which indicates how heavily people depend on digital media. Digital media users are on the rise and so is the incidence of cyber cri...

Bibliographic Details
Main Authors:	Mukta Mithra Raj, J. Angel Arul Jothi
Format:	Article
Language:	English
Published:	ITI Research Group 2022-12-01
Series:	ParadigmPlus
Subjects:	URL Features, Data Mining, Machine Learning, Hybrid Classification Algorithms, Phishing Website Detection
Online Access:	https://journals.itiud.org/index.php/paradigmplus/article/view/39

_version_	1811254557374676992
author	Mukta Mithra Raj, J. Angel Arul Jothi
author_facet	Mukta Mithra Raj, J. Angel Arul Jothi
author_sort	Mukta Mithra Raj
collection	DOAJ
description	The internet has significantly altered how we work and interact with one another. Statistics show that 63.1% of the present world population are internet users, which indicates how heavily people depend on digital media. Digital media users are on the rise, and so is the incidence of cyber crimes. People who lack experience and knowledge are more susceptible to phishing scams. The victims experience severe consequences as their personal credentials are at stake. Phishers use publicly available sources to acquire details about the victim's professional and personal history. Countermeasures must be implemented with the highest priority. Detection of malicious websites can significantly reduce the risk of phishing attempts. In this research, a highly accurate website phishing detection method based on URL features is proposed. We investigated eight existing machine learning classification techniques to detect malicious websites: extreme gradient boosting (XGBoost), random forest (RF), adaptive boosting (AdaBoost), decision trees (DT), K-nearest neighbors (KNN), support vector machines (SVM), logistic regression, and naïve Bayes (NB). The results show that XGBoost had the best accuracy, with a score of 96.71%, followed by random forest and AdaBoost. We further experimented with various hybrid combinations of the top three classifiers and observed that the XGBoost-random forest hybrid produced the best results. The hybrid model classified the websites as legitimate or phishing with an accuracy of 97.07%.
first_indexed	2024-04-12T17:09:11Z
format	Article
id	doaj.art-8e225217f3404f898616367aee6a0017
institution	Directory of Open Access Journals
issn	2711-4627
language	English
last_indexed	2024-04-12T17:09:11Z
publishDate	2022-12-01
publisher	ITI Research Group
record_format	Article
series	ParadigmPlus
spelling	doaj.art-8e225217f3404f898616367aee6a0017; 2022-12-22T03:23:51Z; eng; ITI Research Group; ParadigmPlus; 2711-4627; 2022-12-01; 3; 3; 10.55969/paradigmplus.v3n3a2; Hybrid Approach for Phishing Website Detection Using Classification Algorithms; Mukta Mithra Raj (Birla Institute of Technology and Science Pilani, United Arab Emirates); J. Angel Arul Jothi (Birla Institute of Technology and Science Pilani, United Arab Emirates); The internet has significantly altered how we work and interact with one another. Statistics show that 63.1% of the present world population are internet users, which indicates how heavily people depend on digital media. Digital media users are on the rise, and so is the incidence of cyber crimes. People who lack experience and knowledge are more susceptible to phishing scams. The victims experience severe consequences as their personal credentials are at stake. Phishers use publicly available sources to acquire details about the victim's professional and personal history. Countermeasures must be implemented with the highest priority. Detection of malicious websites can significantly reduce the risk of phishing attempts. In this research, a highly accurate website phishing detection method based on URL features is proposed. We investigated eight existing machine learning classification techniques to detect malicious websites: extreme gradient boosting (XGBoost), random forest (RF), adaptive boosting (AdaBoost), decision trees (DT), K-nearest neighbors (KNN), support vector machines (SVM), logistic regression, and naïve Bayes (NB). The results show that XGBoost had the best accuracy, with a score of 96.71%, followed by random forest and AdaBoost. We further experimented with various hybrid combinations of the top three classifiers and observed that the XGBoost-random forest hybrid produced the best results. The hybrid model classified the websites as legitimate or phishing with an accuracy of 97.07%.; https://journals.itiud.org/index.php/paradigmplus/article/view/39; URL Features; Data Mining; Machine Learning; Hybrid Classification Algorithms; Phishing Website Detection
spellingShingle	Mukta Mithra Raj; J. Angel Arul Jothi; Hybrid Approach for Phishing Website Detection Using Classification Algorithms; ParadigmPlus; URL Features; Data Mining; Machine Learning; Hybrid Classification Algorithms; Phishing Website Detection
title	Hybrid Approach for Phishing Website Detection Using Classification Algorithms
title_full	Hybrid Approach for Phishing Website Detection Using Classification Algorithms
title_fullStr	Hybrid Approach for Phishing Website Detection Using Classification Algorithms
title_full_unstemmed	Hybrid Approach for Phishing Website Detection Using Classification Algorithms
title_short	Hybrid Approach for Phishing Website Detection Using Classification Algorithms
title_sort	hybrid approach for phishing website detection using classification algorithms
topic	URL Features, Data Mining, Machine Learning, Hybrid Classification Algorithms, Phishing Website Detection
url	https://journals.itiud.org/index.php/paradigmplus/article/view/39
work_keys_str_mv	AT muktamithraraj hybridapproachforphishingwebsitedetectionusingclassificationalgorithms AT jangelaruljothi hybridapproachforphishingwebsitedetectionusingclassificationalgorithms
