Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model

Bibliographic Details
Main Authors: Mhamad Bakro, Rakesh Ranjan Kumar, Mohammad Husain, Zubair Ashraf, Arshad Ali, Syed Irfan Yaqoob, Mohammad Nadeem Ahmed, Nikhat Parveen
Format: Article
Language: English
Published: IEEE 2024-01-01
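Series: IEEE Access
Subjects: Hybrid metaheuristic approach; GOA-GA-based feature selection; UNSW-NB15; CIC-DDoS2019; CIC Bell DNS EXF 2021
Online Access: https://ieeexplore.ieee.org/document/10388239/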
author Mhamad Bakro
Rakesh Ranjan Kumar
Mohammad Husain
Zubair Ashraf
Arshad Ali
Syed Irfan Yaqoob
Mohammad Nadeem Ahmed
Nikhat Parveen
author_sort Mhamad Bakro
collection DOAJ
description The adoption of cloud computing has become increasingly widespread across various domains. However, the inherent security vulnerabilities of cloud computing pose significant risks to its overall safety. Consequently, intrusion detection systems (IDS) play a pivotal role in identifying malicious activities within a cloud system. The considerable volume of network traffic data may contain redundant and irrelevant features that can impact the classification performance of the classifier. In addition, the complexity and time consumption increase while processing such a substantial volume of data in the cloud intrusion detection process. To enhance the performance of the IDS, this study proposes a hybrid feature selection approach, combining two bio-inspired algorithms, namely the grasshopper optimization algorithm (GOA) and the genetic algorithm (GA). The combination of these two algorithms ensures a more efficient search for optimal solutions. A random forest (RF) classifier is trained using those optimal features. Moreover, the proposal addresses the challenge of imbalanced data by employing a hybrid approach: over-sampling the minority classes using an adaptive synthetic (ADASYN) algorithm, while implementing random under-sampling (RUS) for the majority class as needed. This integrated strategy significantly influences each category, enhancing the true positive rate (TPR) while minimizing the false positive rate (FPR), thus improving the overall system performance. The proposed approach was evaluated using three datasets: UNSW-NB15, CIC-DDoS2019, and CIC Bell DNS EXF 2021. The recorded accuracies for these datasets were 98%, 99%, and 92%, respectively. The hybrid feature selection-based IDS demonstrated superior performance in multi-class classification, along with exemplary results for individual classes within the datasets. 
The proposed strategy exhibited a marked superiority with the random forest classifier, especially when compared to other classifiers including SVM, LR, FLN, LSTM, AlexNet, DNN, DBN, DT, and XGBoost. Moreover, this performance remained consistent and commendable even when benchmarked against contemporary state-of-the-art methodologies across multiple evaluation metrics.
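The class-balancing step and the per-class TPR/FPR figures mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, it replaces ADASYN with plain random duplication (ADASYN would synthesise new minority samples instead), and the function names `hybrid_resample` and `per_class_rates`, the `target` parameter, and the toy labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def hybrid_resample(samples, labels, target, seed=0):
    """Bring every class to `target` rows (hypothetical helper).

    Classes larger than `target` are randomly under-sampled (RUS).
    Classes smaller than `target` are over-sampled by random duplication,
    a simple stand-in for ADASYN, which would create synthetic rows.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    out_x, out_y = [], []
    for y, rows in by_class.items():
        if len(rows) > target:
            chosen = rng.sample(rows, target)  # draw without replacement
        else:
            extra = [rng.choice(rows) for _ in range(target - len(rows))]
            chosen = rows + extra
        out_x.extend(chosen)
        out_y.extend([y] * target)
    return out_x, out_y


def per_class_rates(y_true, y_pred):
    """Return {class: (TPR, FPR)}, treating each class one-vs-rest."""
    rates = {}
    for c in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        rates[c] = (tpr, fpr)
    return rates


# Toy data with made-up labels: 8 "normal", 2 "dos", 1 "exfil".
X = [[i] for i in range(11)]
y = ["normal"] * 8 + ["dos"] * 2 + ["exfil"]
X_bal, y_bal = hybrid_resample(X, y, target=4)
print(Counter(y_bal))

# Toy predictions, only to exercise the metric helper.
y_true = ["normal", "normal", "dos", "dos", "exfil"]
y_pred = ["normal", "dos", "dos", "dos", "exfil"]
print(per_class_rates(y_true, y_pred))
```

In practice the resampling would be applied to the training split only, so that the reported TPR and FPR reflect the original class distribution of the test data.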
first_indexed 2024-03-08T12:09:49Z
format Article
id doaj.art-94c000cbe98249dc8b24c43a4c49f02e
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T12:09:49Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-94c000cbe98249dc8b24c43a4c49f02e (record updated 2024-01-23T00:06:20Z)
volume 12
pages 8846-8874
doi 10.1109/ACCESS.2024.3353055
article_number 10388239
author_affiliation Mhamad Bakro (https://orcid.org/0000-0003-1446-5127): Department of Computer Science and Engineering, C. V. Raman Global University, Odisha, Bhubaneswar, India
Rakesh Ranjan Kumar: Department of Computer Science and Engineering, C. V. Raman Global University, Odisha, Bhubaneswar, India
Mohammad Husain: Department of Computer Science, Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
Zubair Ashraf (https://orcid.org/0000-0001-7122-2856): Department of Computer Engineering and Applications, GLA University, Mathura, Uttar Pradesh, India
Arshad Ali: Department of Computer Science, Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
Syed Irfan Yaqoob (https://orcid.org/0000-0001-8316-4892): Department of Computer Science and Applications, Dr. Vishwanath Karad MIT World Peace University, Pune, India
Mohammad Nadeem Ahmed (https://orcid.org/0000-0003-1602-0770): Department of Computer Science, King Khalid University, Abha, Saudi Arabia
Nikhat Parveen: Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
title Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model
topic Hybrid metaheuristic approach
GOA-GA-based feature selection
UNSW-NB15
CIC-DDoS2019
CIC Bell DNS EXF 2021
url https://ieeexplore.ieee.org/document/10388239/