Classification Hardness for Supervised Learners on 20 Years of Intrusion Detection Data
This article consolidates analysis of established (NSL-KDD) and new intrusion detection datasets (ISCXIDS2012, CICIDS2017, CICIDS2018) through the use of supervised machine learning (ML) algorithms. The uniformity in analysis procedure opens up the option to compare the obtained results. It also pro...
Main Authors: | Laurens D'hooge, Tim Wauters, Bruno Volckaert, Filip De Turck |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | CICIDS2017, CICIDS2018, cyber security, intrusion detection, ISCXIDS2012, network security |
Online Access: | https://ieeexplore.ieee.org/document/8901110/ |
_version_ | 1818935940011261952 |
---|---|
author | Laurens D'hooge, Tim Wauters, Bruno Volckaert, Filip De Turck |
author_facet | Laurens D'hooge, Tim Wauters, Bruno Volckaert, Filip De Turck |
author_sort | Laurens D'hooge |
collection | DOAJ |
description | This article consolidates the analysis of established (NSL-KDD) and new (ISCXIDS2012, CICIDS2017, CICIDS2018) intrusion detection datasets using supervised machine learning (ML) algorithms. The uniform analysis procedure makes the obtained results directly comparable. It also provides a stronger foundation for conclusions about the efficacy of supervised learners on the main classification task in network security. This research is motivated in part by the lack of adoption of these modern datasets. The scope is deliberately broad, covering classification by algorithms from different families on both established and new datasets, to expand the existing foundation and reveal the most opportune avenues for further inquiry. After baseline results were obtained, the classification task was made more difficult by reducing the data available to learn from, both horizontally and vertically. The data reduction serves as a stress test to verify whether the very high baseline results hold up under increasingly harsh constraints. Ultimately, this work contains the most comprehensive set of results on the topic of intrusion detection through supervised machine learning. Researchers working on algorithmic improvements can compare their results to this collection, knowing that all results reported here were gathered through a uniform framework. This work's main contributions are the outstanding classification results on the current state-of-the-art datasets for intrusion detection and the conclusion that these methods show remarkable resilience in classification performance even when the amount of data to learn from is aggressively reduced. |
first_indexed | 2024-12-20T05:28:09Z |
format | Article |
id | doaj.art-b4f9f1e8fdc9472689789805262143a8 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-20T05:28:09Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-b4f9f1e8fdc9472689789805262143a8; 2022-12-21T19:51:50Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; vol. 7, pp. 167455-167469; 10.1109/ACCESS.2019.2953451; 8901110; Classification Hardness for Supervised Learners on 20 Years of Intrusion Detection Data; Laurens D'hooge (https://orcid.org/0000-0001-5086-6361), Tim Wauters, Bruno Volckaert, Filip De Turck (all: Department of Information Technology, IDLab, Ghent University–imec, Ghent, Belgium); https://ieeexplore.ieee.org/document/8901110/; CICIDS2017, CICIDS2018, cyber security, intrusion detection, ISCXIDS2012, network security |
title | Classification Hardness for Supervised Learners on 20 Years of Intrusion Detection Data |
topic | CICIDS2017, CICIDS2018, cyber security, intrusion detection, ISCXIDS2012, network security |
url | https://ieeexplore.ieee.org/document/8901110/ |