Enhanced intrusion detection model based on principal component analysis and variable ensemble machine learning algorithm

The intrusion detection system (IDS) model, which can identify the presence of intruders in the network and take some predefined action for safe data transit across the network, is advantageous in achieving security in both simple and advanced network systems. Several IDS models have various securit...


Bibliographic Details
Main Authors: John, Ayuba, Isnin, Ismail Fauzi, Madni, Syed Hamid Hussain, Muchtar, Farkhana
Format: Article
Language: English
Published: Elsevier B.V. 2024
Subjects: T Technology (General); T58.6-58.62 Management information systems
Online Access: http://eprints.utm.my/109022/1/FarkhanaMuchtar2024_EnhancedInstrusionDetectionModelBasedonPrincipal.pdf
author John, Ayuba
Isnin, Ismail Fauzi
Madni, Syed Hamid Hussain
Muchtar, Farkhana
collection ePrints
description The intrusion detection system (IDS) model, which identifies intruders in a network and takes predefined action to keep data transit safe, helps secure both simple and advanced network systems. Many IDS models suffer from low detection accuracy and high false-alarm rates, which can be caused by the excessive dimensionality and class imbalance of the network traffic datasets used to build them. Principal Component Analysis (PCA) has proven to be a helpful feature selection technique for dimensionality reduction. However, because it is a linear transformation, it has difficulty capturing non-linear relationships between features in network traffic datasets. This paper proposes a variable ensemble machine learning method to address this problem and achieve a low-variance model with high accuracy and a low false-alarm rate. First, PCA is combined with the AdaBoost ensemble algorithm, which acts as stagewise additive modelling and compensates for PCA's deficiency in feature selection by minimizing the exponential loss function. Second, PCA is used for feature selection together with a LogitBoost classifier, which supports multiclass classification and acts as additive tree regression, compensating for PCA's weakness by minimizing the logistic loss to provide an optimal classifier output. Finally, the low-variance property of RandomForest, which employs the bagging approach, is applied to reduce overfitting. The IDS models developed from the proposed methods were evaluated on the WSN-DS, NSL-KDD, and UNSW-NB15 datasets. PCA with AdaBoost achieved an accuracy of 92.3 % on WSN-DS, 89.0 % on NSL-KDD, and 67.9 % on UNSW-NB15, its lowest score. PCA with RandomForest surpassed it, scoring 100 % accuracy on all three datasets. PCA with Bagging achieved 99.8 % on WSN-DS, 100 % on NSL-KDD, and 93.4 % on UNSW-NB15. In comparison, PCA with LogitBoost achieved 98.9 % on WSN-DS, 100 % on NSL-KDD, and 88.7 % on UNSW-NB15.
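The abstract's general idea, PCA for dimensionality reduction followed by an ensemble classifier, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, a synthetic dataset from `make_classification` in place of real network traffic, and arbitrary choices for the number of components and estimators. The helper names `build_pipelines` and `evaluate` are hypothetical. LogitBoost is omitted because scikit-learn does not ship a LogitBoost classifier.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pipelines(n_components=10, seed=0):
    """Return PCA + ensemble pipelines (hyperparameters are assumptions)."""

    def with_pca(clf):
        # Standardize first: PCA is sensitive to feature scale.
        return make_pipeline(StandardScaler(), PCA(n_components=n_components), clf)

    return {
        "PCA+AdaBoost": with_pca(
            AdaBoostClassifier(n_estimators=50, random_state=seed)
        ),
        "PCA+Bagging": with_pca(
            BaggingClassifier(n_estimators=50, random_state=seed)
        ),
        "PCA+RandomForest": with_pca(
            RandomForestClassifier(n_estimators=100, random_state=seed)
        ),
    }


def evaluate(X, y, seed=0):
    """Fit each pipeline on a stratified split and return test accuracies."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    scores = {}
    for name, pipe in build_pipelines(seed=seed).items():
        pipe.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, pipe.predict(X_test))
    return scores


if __name__ == "__main__":
    # Synthetic stand-in data; NOT one of the datasets named in the abstract.
    X, y = make_classification(
        n_samples=600, n_features=30, n_informative=12, n_classes=3, random_state=0
    )
    for name, acc in evaluate(X, y).items():
        print(f"{name}: {acc:.3f}")
```

Wrapping the scaler, PCA, and classifier in one pipeline means the projection is fitted on the training split only, which avoids leaking test data into the reduced feature space. Accuracies from this sketch are not comparable to the figures reported in the abstract.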
first_indexed 2025-02-19T02:44:55Z
format Article
id utm.eprints-109022
institution Universiti Teknologi Malaysia - ePrints
language English
last_indexed 2025-02-19T02:44:55Z
publishDate 2024
publisher Elsevier B.V.
record_format dspace
spelling utm.eprints-109022 2025-01-27T08:36:52Z http://eprints.utm.my/109022/
John, Ayuba and Isnin, Ismail Fauzi and Madni, Syed Hamid Hussain and Muchtar, Farkhana (2024) Enhanced intrusion detection model based on principal component analysis and variable ensemble machine learning algorithm. Intelligent Systems with Applications, 24 (NA). pp. 1-18. ISSN 2667-3053
Elsevier B.V. 2024-12 Article PeerReviewed application/pdf en
http://dx.doi.org/10.1016/j.iswa.2024.200442 DOI:10.1016/j.iswa.2024.200442
title Enhanced intrusion detection model based on principal component analysis and variable ensemble machine learning algorithm
topic T Technology (General)
T58.6-58.62 Management information systems
url http://eprints.utm.my/109022/1/FarkhanaMuchtar2024_EnhancedInstrusionDetectionModelBasedonPrincipal.pdf