Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model
The adoption of cloud computing has become increasingly widespread across various domains. However, the inherent security vulnerabilities of cloud computing pose significant risks to its overall safety. Consequently, intrusion detection systems (IDS) play a pivotal role in identifying malicious acti...
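Main Authors: | Mhamad Bakro, Rakesh Ranjan Kumar, Mohammad Husain, Zubair Ashraf, Arshad Ali, Syed Irfan Yaqoob, Mohammad Nadeem Ahmed, Nikhat Parveen |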
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
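Subjects: | Hybrid metaheuristic approach; GOA-GA-based feature selection; UNSW-NB15; CIC-DDoS2019; CIC Bell DNS EXF 2021 |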
Online Access: | https://ieeexplore.ieee.org/document/10388239/ |
_version_ | 1797348704772947968 |
---|---|
author | Mhamad Bakro, Rakesh Ranjan Kumar, Mohammad Husain, Zubair Ashraf, Arshad Ali, Syed Irfan Yaqoob, Mohammad Nadeem Ahmed, Nikhat Parveen |
author_facet | Mhamad Bakro, Rakesh Ranjan Kumar, Mohammad Husain, Zubair Ashraf, Arshad Ali, Syed Irfan Yaqoob, Mohammad Nadeem Ahmed, Nikhat Parveen |
author_sort | Mhamad Bakro |
collection | DOAJ |
description | The adoption of cloud computing has become increasingly widespread across various domains. However, the inherent security vulnerabilities of cloud computing pose significant risks to its overall safety. Consequently, intrusion detection systems (IDS) play a pivotal role in identifying malicious activities within a cloud system. The considerable volume of network traffic data may contain redundant and irrelevant features that can impact the classification performance of the classifier. In addition, the complexity and time consumption increase while processing such a substantial volume of data in the cloud intrusion detection process. To enhance the performance of the IDS, this study proposes a hybrid feature selection approach, combining two bio-inspired algorithms, namely the grasshopper optimization algorithm (GOA) and the genetic algorithm (GA). The combination of these two algorithms ensures a more efficient search for optimal solutions. A random forest (RF) classifier is trained using those optimal features. Moreover, the proposal addresses the challenge of imbalanced data by employing a hybrid approach: over-sampling the minority classes using an adaptive synthetic (ADASYN) algorithm, while implementing random under-sampling (RUS) for the majority class as needed. This integrated strategy significantly influences each category, enhancing the true positive rate (TPR) while minimizing the false positive rate (FPR), thus improving the overall system performance. The proposed approach was evaluated using three datasets: UNSW-NB15, CIC-DDoS2019, and CIC Bell DNS EXF 2021. The recorded accuracies for these datasets were 98%, 99%, and 92%, respectively. The hybrid feature selection-based IDS demonstrated superior performance in multi-class classification, along with exemplary results for individual classes within the datasets. The proposed strategy exhibited a marked superiority with the random forest classifier, especially when compared to other classifiers including SVM, LR, FLN, LSTM, AlexNet, DNN, DBN, DT, and XGBoost. Moreover, this performance remained consistent and commendable even when benchmarked against contemporary state-of-the-art methodologies across multiple evaluation metrics. |
first_indexed | 2024-03-08T12:09:49Z |
format | Article |
id | doaj.art-94c000cbe98249dc8b24c43a4c49f02e |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T12:09:49Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-94c000cbe98249dc8b24c43a4c49f02e; 2024-01-23T00:06:20Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 8846; 8874; 10.1109/ACCESS.2024.3353055; 10388239; Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model; Mhamad Bakro (https://orcid.org/0000-0003-1446-5127), Department of Computer Science and Engineering, C. V. Raman Global University, Odisha, Bhubaneswar, India; Rakesh Ranjan Kumar, Department of Computer Science and Engineering, C. V. Raman Global University, Odisha, Bhubaneswar, India; Mohammad Husain, Department of Computer Science, Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia; Zubair Ashraf (https://orcid.org/0000-0001-7122-2856), Department of Computer Engineering and Applications, GLA University, Mathura, Uttar Pradesh, India; Arshad Ali, Department of Computer Science, Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia; Syed Irfan Yaqoob (https://orcid.org/0000-0001-8316-4892), Department of Computer Science and Applications, Dr. Vishwanath Karad MIT World Peace University, Pune, India; Mohammad Nadeem Ahmed (https://orcid.org/0000-0003-1602-0770), Department of Computer Science, King Khalid University, Abha, Saudi Arabia; Nikhat Parveen, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India; https://ieeexplore.ieee.org/document/10388239/; Hybrid metaheuristic approach; GOA-GA-based feature selection; UNSW-NB15; CIC-DDoS2019; CIC Bell DNS EXF 2021 |
spellingShingle | Mhamad Bakro, Rakesh Ranjan Kumar, Mohammad Husain, Zubair Ashraf, Arshad Ali, Syed Irfan Yaqoob, Mohammad Nadeem Ahmed, Nikhat Parveen; Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model; IEEE Access; Hybrid metaheuristic approach; GOA-GA-based feature selection; UNSW-NB15; CIC-DDoS2019; CIC Bell DNS EXF 2021 |
title | Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model |
title_full | Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model |
title_fullStr | Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model |
title_full_unstemmed | Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model |
title_short | Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model |
title_sort | building a cloud ids by hybrid bio inspired feature selection algorithms along with random forest model |
topic | Hybrid metaheuristic approach; GOA-GA-based feature selection; UNSW-NB15; CIC-DDoS2019; CIC Bell DNS EXF 2021 |
url | https://ieeexplore.ieee.org/document/10388239/ |
work_keys_str_mv | AT mhamadbakro buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT rakeshranjankumar buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT mohammadhusain buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT zubairashraf buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT arshadali buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT syedirfanyaqoob buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT mohammadnadeemahmed buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel AT nikhatparveen buildingacloudidsbyhybridbioinspiredfeatureselectionalgorithmsalongwithrandomforestmodel |
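
The description reports per-category gains in true positive rate (TPR) and false positive rate (FPR) for a multi-class IDS. As a minimal sketch of how those two rates are commonly computed per class with a one-vs-rest view, the Python below uses a hypothetical helper `per_class_rates` and made-up labels (`normal`, `dos`, `exploit`). It is an illustration of the standard definitions only, not the authors' evaluation code, and the toy data is not drawn from any of the three datasets.

```python
def per_class_rates(y_true, y_pred):
    """Return {label: (tpr, fpr)}, treating each class as positive in turn."""
    pairs = list(zip(y_true, y_pred))
    rates = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        tn = sum(1 for t, p in pairs if t != label and p != label)
        # TPR = TP / (TP + FN); FPR = FP / (FP + TN); guard empty denominators.
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        rates[label] = (tpr, fpr)
    return rates


# Hypothetical toy labels, for illustration only.
y_true = ["normal", "normal", "dos", "dos", "exploit", "normal"]
y_pred = ["normal", "dos", "dos", "dos", "normal", "normal"]
rates = per_class_rates(y_true, y_pred)
for label, (tpr, fpr) in rates.items():
    print(f"{label}: TPR={tpr:.2f} FPR={fpr:.2f}")
```

In this toy data the single `exploit` sample is never predicted, so its TPR is zero even though overall accuracy looks acceptable; this is the kind of minority-class effect that class rebalancing is meant to address.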