Phishing hybrid feature-based classifier by using recursive features subset selection and machine learning algorithms

Bibliographic Details
Main Authors: Zuhair, H., Selamat, A.
Format: Conference or Workshop Item
Language: English
Published: 2019
Online Access: http://eprints.utm.my/88940/1/HibaZuhair2019_PhishingHybridFeatureBasedClassifier.pdf
Description
Summary: Machine learning classifiers have enriched anti-phishing schemes with effective phishing classification models. However, they have been constrained by inductive deficiencies such as learning on big and imbalanced data, deploying rich sets of features, and learning classifiers actively. This resulted in heavyweight phishing classifiers with massive misclassifications in real-time phishing detection. To reduce these deficiencies, this paper proposed a new Phishing Hybrid Feature-Based Classifier (PHFBC), which hybridized two machine learning algorithms, Naïve Bayes and Decision Tree, with a statistical criterion called the Phish Ratio. In conjunction, a Recursive Feature Subset Selection Algorithm (RFSSA) was proposed to characterize phishing holistically with a robust selected subset of features. Performance assessment via simulations, real-time validation, and comparative analysis demonstrated that PHFBC stood out among its competitors in terms of classification accuracy and minimal misclassification of novel phishes on the Web.
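
Illustrative sketch (not from the paper): the summary names a recursive feature subset selection step and a Naïve Bayes learner but gives no algorithmic detail. The Python below shows one generic way such a step could look: a backward elimination that recursively drops the feature whose removal leaves validation accuracy highest, scored by a small Bernoulli Naïve Bayes model. It is not the authors' RFSSA or PHFBC, and it omits the Decision Tree and the Phish Ratio, which this record does not define. The function names, the three binary features, and the toy data are all hypothetical.

```python
import math

def train_nb(rows, labels, feats):
    """Fit a Bernoulli Naive Bayes model (Laplace-smoothed) on binary features."""
    model = {}
    for c in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == c]
        prior = (len(members) + 1) / (len(rows) + 2)
        probs = {f: (sum(r[f] for r in members) + 1) / (len(members) + 2)
                 for f in feats}
        model[c] = (prior, probs)
    return model

def predict_nb(model, row, feats):
    """Return the class (0 or 1) with the highest log-posterior score."""
    scores = {}
    for c, (prior, probs) in model.items():
        s = math.log(prior)
        for f in feats:
            s += math.log(probs[f] if row[f] else 1 - probs[f])
        scores[c] = s
    return max(scores, key=scores.get)

def accuracy(train, val, feats):
    """Train on `train` and report accuracy on `val` using only `feats`."""
    model = train_nb(train[0], train[1], feats)
    hits = sum(predict_nb(model, r, feats) == y for r, y in zip(*val))
    return hits / len(val[1])

def recursive_select(train, val, feats, keep):
    """Recursively drop the feature whose removal hurts validation accuracy least."""
    if len(feats) <= keep:
        return feats
    drop = max(feats,
               key=lambda f: accuracy(train, val, [g for g in feats if g != f]))
    return recursive_select(train, val, [g for g in feats if g != drop], keep)

# Hypothetical toy data: each row is (feature 0, feature 1, feature 2);
# label 1 = phishing, 0 = legitimate. Feature 0 tracks the label exactly,
# feature 1 is noisy, feature 2 is uninformative.
TRAIN = ([(1, 1, 0), (1, 1, 1), (1, 0, 0), (1, 1, 1),
          (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 0, 1)],
         [1, 1, 1, 1, 0, 0, 0, 0])
VAL = ([(1, 0, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0)],
       [1, 1, 0, 0])

selected = recursive_select(TRAIN, VAL, [0, 1, 2], keep=1)
print("selected feature indices:", selected)
```

On this toy data the procedure retains only the feature that tracks the label. A real evaluation would need held-out data that is large and representative enough for the accuracy comparisons to be meaningful.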