An Efficient Hybrid Feature Selection Technique Toward Prediction of Suspicious URLs in IoT Environment

With the growth of IoT, a vast number of devices are connected to the web. Consequently, both users and devices are susceptible to deception by intruders through malicious links leading to the disclosure of personal information. Hence, it is essential to identify suspicious URLs before accessing the...

Bibliographic Details
Main Authors: Sanjukta Mohanty, Arup Abhinna Acharya, Tarek Gaber, Namita Panda, Esraa Eldesouky, Ibrahim A. Hameed
Format: Article
Language: English
Published: IEEE 2024-01-01
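Series: IEEE Access
Subjects: Boosting estimators; feature selection technique (FSTs); genetic algorithm (GA); Internet of Things (IoT); suspicious URLs
Online Access: https://ieeexplore.ieee.org/document/10489965/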
author Sanjukta Mohanty
Arup Abhinna Acharya
Tarek Gaber
Namita Panda
Esraa Eldesouky
Ibrahim A. Hameed
collection DOAJ
description With the growth of the IoT, a vast number of devices are connected to the web. Consequently, both users and devices are susceptible to deception by intruders through malicious links, which can lead to the disclosure of personal information. Hence, it is essential to identify suspicious URLs before accessing them. Although researchers have proposed many URL detection approaches, machine learning-based techniques stand out as particularly effective because they can detect zero-day attacks; however, their success depends on the type and dimensionality of the features used. Earlier research employed only the lexical features of URLs for classification in order to attain high detection speeds, but this approach cannot retrieve comprehensive information about a website. Hence, to enhance the security of IoT devices, both lexical and page content-based features of URLs must be considered. To improve model performance, researchers extract informative features using different Feature Selection Techniques (FSTs), including filter and wrapper methods. However, individual FSTs demand more resources and time and struggle with high-dimensional datasets, which has driven the development of hybrid FSTs. Nevertheless, the combination of a filter-based FST and a wrapper search-based Genetic Algorithm (GA) is used in the identification of malicious URLs as well as in the detection of malicious links in IoT device research. Therefore, the proposed approach leverages a variety of features and explores a hybrid FST to produce the optimal feature subset, which is then used to evaluate boosting estimators with specific hyperparameter configurations. The proposed approach fills the research gap associated with previous methodologies, achieving 99% detection performance while keeping computational costs minimal, making it suitable for detecting malicious URLs on resource-constrained devices.
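The hybrid selection described in the abstract (a filter stage followed by a GA-driven wrapper stage scored with a boosting estimator) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the record does not name the filter criterion, the boosting estimator, or the GA settings, so mutual information, scikit-learn's `GradientBoostingClassifier`, the synthetic dataset, and every parameter value below are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score


def filter_stage(X, y, keep, seed=0):
    """Filter step (assumed criterion): keep the `keep` features with highest mutual information."""
    scores = mutual_info_classif(X, y, random_state=seed)
    return np.argsort(scores)[::-1][:keep]


def fitness(mask, X, y, seed=0):
    """Wrapper fitness: 3-fold cross-validated accuracy of a boosting estimator on the masked features."""
    if not mask.any():
        return 0.0  # an empty subset cannot be evaluated
    model = GradientBoostingClassifier(n_estimators=20, max_depth=2, random_state=seed)
    return cross_val_score(model, X[:, mask], y, cv=3).mean()


def ga_wrapper(X, y, pop_size=8, generations=4, mutation_rate=0.1, seed=0):
    """Small genetic algorithm over boolean feature masks (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5  # random initial masks
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y, seed) for ind in pop])
        top = int(scores.argmax())
        if scores[top] > best_score:
            best_mask, best_score = pop[top].copy(), float(scores[top])
        # Truncation selection: the better half of the population become parents.
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)  # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < mutation_rate  # bit-flip mutation
            children.append(child ^ flip)
        pop = np.array(children)
    return best_mask, best_score


def hybrid_select(X, y, keep=10, seed=0):
    """Filter first to shrink the search space, then let the GA search the survivors."""
    kept = filter_stage(X, y, keep, seed)
    mask, score = ga_wrapper(X[:, kept], y, seed=seed)
    return kept[mask], score


# Synthetic stand-in for a URL feature matrix (lexical plus page-content features).
X, y = make_classification(n_samples=200, n_features=20, n_informative=5, random_state=0)
selected, cv_accuracy = hybrid_select(X, y)
print(sorted(selected.tolist()), round(cv_accuracy, 3))
```

Running the cheap filter before the GA keeps the wrapper's search space small, which is the usual motivation for hybrid schemes of this kind when compute is limited.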
first_indexed 2024-04-24T09:02:22Z
format Article
id doaj.art-ee6c72b3ea4048229df327e1db4c1a6e
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-24T09:02:22Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-ee6c72b3ea4048229df327e1db4c1a6e
2024-04-15T23:00:32Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
Volume 12, pages 50578-50594
DOI 10.1109/ACCESS.2024.3384840
IEEE Xplore document 10489965
An Efficient Hybrid Feature Selection Technique Toward Prediction of Suspicious URLs in IoT Environment
Sanjukta Mohanty: School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
Arup Abhinna Acharya: School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
Tarek Gaber (https://orcid.org/0000-0003-4065-4191): School of Science, Engineering, and Environment, University of Salford, Salford, U.K.
Namita Panda: School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
Esraa Eldesouky: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
Ibrahim A. Hameed (https://orcid.org/0000-0003-1252-260X): Department of ICT and Natural Sciences, Faculty of Information Technology and Electrical Engineering, Norwegian University of Science and Technology, Ålesund, Norway
https://ieeexplore.ieee.org/document/10489965/
Boosting estimators
feature selection technique (FSTs)
genetic algorithm (GA)
Internet of Things (IoT)
suspicious URLs
title An Efficient Hybrid Feature Selection Technique Toward Prediction of Suspicious URLs in IoT Environment
topic Boosting estimators
feature selection technique (FSTs)
genetic algorithm (GA)
Internet of Things (IoT)
suspicious URLs
url https://ieeexplore.ieee.org/document/10489965/