Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems

Wi-Fi is arguably the most proliferated wireless technology today. Due to its massive adoption, Wi-Fi deployments always remain in the epicenter of attackers and evildoers. Surprisingly, research regarding machine learning driven intrusion detection systems (IDS) that are specifically optimized to d...


Bibliographic Details
Main Authors: Efstratios Chatzoglou, Georgios Kambourakis, Constantinos Kolias, Christos Smiliotopoulos
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9797689/
author Efstratios Chatzoglou
Georgios Kambourakis
Constantinos Kolias
Christos Smiliotopoulos
collection DOAJ
description Wi-Fi is arguably the most proliferated wireless technology today. Due to its massive adoption, Wi-Fi deployments always remain in the epicenter of attackers and evildoers. Surprisingly, research regarding machine learning driven intrusion detection systems (IDS) that are specifically optimized to detect Wi-Fi attacks is lagging behind. On top of that, the field is dominated by false or half-true assumptions that potentially can lead to corresponding models being overfitted to certain validation datasets, simply giving the impression or illusion of high efficiency. This work attempts to provide concrete answers to the following key questions regarding IEEE 802.11 machine learning driven IDS. First, from an expert’s viewpoint and with reference to the relevant literature, what are the criteria for determining the smallest possible set of classification features, which are also common and potentially transferable to virtually any deployment types/versions of 802.11? And second, based on these features, what is the detection performance across different network versions and diverse machine learning techniques, i.e., shallow versus deep learning ones? To answer these questions, we rely on the renowned 802.11 security-oriented AWID family of datasets. In a nutshell, our experiments demonstrate that with a rather small set of 16 features and without the use of any optimization or ensemble method, shallow and deep learning classification can achieve an average F1 score of up to 99.55% and 97.55%, respectively. We argue that the suggested human expert driven feature selection leads to lightweight, deployment-agnostic detection systems, and therefore can be used as a basis for future work in this interesting and rapidly evolving field.
first_indexed 2024-12-12T07:09:32Z
format Article
id doaj.art-faaa09f8f9ae424cb73ea9b94dc27a12
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-12T07:09:32Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-faaa09f8f9ae424cb73ea9b94dc27a12
2022-12-22T00:33:40Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2022-01-01, vol. 10, pp. 64761-64784
doi 10.1109/ACCESS.2022.3183597
IEEE Xplore document 9797689
Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems
Efstratios Chatzoglou (https://orcid.org/0000-0001-6507-5052), Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece
Georgios Kambourakis (https://orcid.org/0000-0001-6348-5031), European Commission, Joint Research Centre (JRC), Ispra, Italy
Constantinos Kolias (https://orcid.org/0000-0002-3020-291X), Department of Computer Science, University of Idaho, Idaho Falls, ID, USA
Christos Smiliotopoulos (https://orcid.org/0000-0001-7530-7152), Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece
title Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems
topic Intrusion detection
WiFi
802.11
machine learning
deep learning
dataset
url https://ieeexplore.ieee.org/document/9797689/