Classification Hardness for Supervised Learners on 20 Years of Intrusion Detection Data


Bibliographic Details
Main Authors:	Laurens D'hooge, Tim Wauters, Bruno Volckaert, Filip De Turck
Format:	Article
Language:	English
Published:	IEEE 2019-01-01
Series:	IEEE Access
Subjects:	CICIDS2017; CICIDS2018; cyber security; intrusion detection; ISCXIDS2012; network security
Online Access:	https://ieeexplore.ieee.org/document/8901110/

collection	DOAJ
description	This article consolidates the analysis of established (NSL-KDD) and new (ISCXIDS2012, CICIDS2017, CICIDS2018) intrusion detection datasets using supervised machine learning (ML) algorithms. The uniform analysis procedure makes the obtained results comparable and provides a stronger foundation for conclusions about the efficacy of supervised learners on the main classification task in network security. The research is motivated in part by the lack of adoption of these modern datasets. The study starts with a broad scope, applying algorithms from different families to both established and new datasets, in order to expand the existing foundation and reveal the most opportune avenues for further inquiry. After baseline results were obtained, the classification task was made more difficult by reducing the data available to learn from, both horizontally and vertically. This data reduction serves as a stress test to verify whether the very high baseline results hold up under increasingly harsh constraints. Ultimately, this work contains the most comprehensive set of results on the topic of intrusion detection through supervised machine learning. Researchers working on algorithmic improvements can compare their results to this collection, knowing that all results reported here were gathered through a uniform framework. The main contributions of this work are the outstanding classification results on the current state-of-the-art datasets for intrusion detection and the conclusion that these methods show remarkable resilience in classification performance even when the amount of data to learn from is aggressively reduced.
id	doaj.art-b4f9f1e8fdc9472689789805262143a8
institution	Directory of Open Access Journals
issn	2169-3536
doi	10.1109/ACCESS.2019.2953451
volume	7
pages	167455-167469
author_orcid	Laurens D'hooge: https://orcid.org/0000-0001-5086-6361
author_affiliation	Department of Information Technology, IDLab, Ghent University–imec, Ghent, Belgium (all four authors)


Similar Items