A Multi-Agent Intrusion Detection System Optimized by a Deep Reinforcement Learning Approach with a Dataset Enlarged Using a Generative Model to Reduce the Bias Effect

Intrusion detection systems can perform poorly when they are trained on datasets that are imbalanced between attack and non-attack data. Most datasets contain more non-attack data than attack data, and this imbalance can bias intrusion detection systems, making the...


Bibliographic Details
Main Authors: Matthieu Mouyart, Guilherme Medeiros Machado, Jae-Yun Jun
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Journal of Sensor and Actuator Networks
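Subjects: cybersecurity; intrusion detection; insider threat; multi-agent system; generative adversarial network; deep reinforcement learning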
Online Access: https://www.mdpi.com/2224-2708/12/5/68
author Matthieu Mouyart
Guilherme Medeiros Machado
Jae-Yun Jun
collection DOAJ
description Intrusion detection systems can perform poorly when they are trained on datasets that are imbalanced between attack and non-attack data. Most datasets contain more non-attack data than attack data, and this imbalance can bias intrusion detection systems, making them vulnerable to cyberattacks. To remedy this issue, we considered the Conditional Tabular Generative Adversarial Network (CTGAN), with its hyperparameters optimized using the tree-structured Parzen estimator (TPE), to balance an insider threat tabular dataset called CMU-CERT, which consists of discrete-valued and continuous-valued columns. We showed that, with this method, the mean absolute errors between the probability mass functions (PMFs) of the actual data and the PMFs of the data generated using the CTGAN can be relatively small. Then, using the optimized CTGAN, we generated synthetic insider threat data and combined them with the actual data to balance the original dataset. We used the resulting dataset for an intrusion detection system implemented with the Adversarial Environment Reinforcement Learning (AE-RL) algorithm in a multi-agent framework formed by an attacker and a defender. We showed that intrusion detection performance using the combined CTGAN and AE-RL framework is significantly improved compared with the case where the dataset is not balanced, giving an F1-score of 0.7617.
first_indexed 2024-03-10T21:08:00Z
format Article
id doaj.art-8beb2947ba5c45739a853899100c82da
institution Directory Open Access Journal
issn 2224-2708
language English
last_indexed 2024-03-10T21:08:00Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Journal of Sensor and Actuator Networks
spelling doaj.art-8beb2947ba5c45739a853899100c82da
2023-11-19T17:02:54Z
eng
MDPI AG
Journal of Sensor and Actuator Networks
2224-2708
2023-09-01
12
5
68
10.3390/jsan12050068
A Multi-Agent Intrusion Detection System Optimized by a Deep Reinforcement Learning Approach with a Dataset Enlarged Using a Generative Model to Reduce the Bias Effect
Matthieu Mouyart, LyRIDS, ECE Paris, 10 rue Sextius Michel, 75015 Paris, France
Guilherme Medeiros Machado, LyRIDS, ECE Paris, 10 rue Sextius Michel, 75015 Paris, France
Jae-Yun Jun, LyRIDS, ECE Paris, 10 rue Sextius Michel, 75015 Paris, France
title A Multi-Agent Intrusion Detection System Optimized by a Deep Reinforcement Learning Approach with a Dataset Enlarged Using a Generative Model to Reduce the Bias Effect
topic cybersecurity
intrusion detection
insider threat
multi-agent system
generative adversarial network
deep reinforcement learning
url https://www.mdpi.com/2224-2708/12/5/68
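
The description above evaluates the generated data by the mean absolute error between the probability mass functions (PMFs) of the actual and the synthetic data. The sketch below illustrates one way such a comparison could be computed for a single discrete column. It is not the authors' implementation: the function names pmf and pmf_mae, the toy column values, and the choice to average over the union of observed categories are all assumptions made for illustration.

```python
from collections import Counter


def pmf(values, support):
    """Empirical probability mass function of `values` over `support`."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    n = len(values)
    # Categories never observed in `values` get probability 0.
    return {v: counts.get(v, 0) / n for v in support}


def pmf_mae(real, synthetic):
    """Mean absolute error between the PMFs of two discrete samples.

    Assumption: the error is averaged over the union of categories
    seen in either sample; the paper may average differently.
    """
    support = sorted(set(real) | set(synthetic))
    p = pmf(real, support)
    q = pmf(synthetic, support)
    return sum(abs(p[v] - q[v]) for v in support) / len(support)


# Hypothetical toy column (not taken from CMU-CERT).
real_col = ["logon", "logon", "email", "http",
            "logon", "email", "http", "http"]
synth_col = ["logon", "email", "email", "http",
             "logon", "email", "http", "device"]

mae = pmf_mae(real_col, synth_col)
print(f"PMF mean absolute error: {mae:.4f}")
```

A value close to 0 indicates that the synthetic column reproduces the category frequencies of the actual column. For continuous columns, the values would first need to be binned, and the binning scheme is a further assumption.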