Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems
Wi-Fi is arguably the most proliferated wireless technology today. Due to its massive adoption, Wi-Fi deployments always remain in the epicenter of attackers and evildoers. Surprisingly, research regarding machine learning driven intrusion detection systems (IDS) that are specifically optimized to d...
Main Authors: | Efstratios Chatzoglou, Georgios Kambourakis, Constantinos Kolias, Christos Smiliotopoulos |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
Subjects: | Intrusion detection, WiFi, 802.11, machine learning, deep learning, dataset |
Online Access: | https://ieeexplore.ieee.org/document/9797689/ |
_version_ | 1828802739165986816 |
---|---|
author | Efstratios Chatzoglou; Georgios Kambourakis; Constantinos Kolias; Christos Smiliotopoulos |
author_sort | Efstratios Chatzoglou |
collection | DOAJ |
description | Wi-Fi is arguably the most proliferated wireless technology today. Due to its massive adoption, Wi-Fi deployments always remain in the epicenter of attackers and evildoers. Surprisingly, research regarding machine learning driven intrusion detection systems (IDS) that are specifically optimized to detect Wi-Fi attacks is lagging behind. On top of that, the field is dominated by false or half-true assumptions that potentially can lead to corresponding models being overfitted to certain validation datasets, simply giving the impression or illusion of high efficiency. This work attempts to provide concrete answers to the following key questions regarding IEEE 802.11 machine learning driven IDS. First, from an expert’s viewpoint and with reference to the relevant literature, what are the criteria for determining the smallest possible set of classification features, which are also common and potentially transferable to virtually any deployment types/versions of 802.11? And second, based on these features, what is the detection performance across different network versions and diverse machine learning techniques, i.e., shallow versus deep learning ones? To answer these questions, we rely on the renowned 802.11 security-oriented AWID family of datasets. In a nutshell, our experiments demonstrate that with a rather small set of 16 features and without the use of any optimization or ensemble method, shallow and deep learning classification can achieve an average F1 score of up to 99.55% and 97.55%, respectively. We argue that the suggested human expert driven feature selection leads to lightweight, deployment-agnostic detection systems, and therefore can be used as a basis for future work in this interesting and rapidly evolving field. |
first_indexed | 2024-12-12T07:09:32Z |
format | Article |
id | doaj.art-faaa09f8f9ae424cb73ea9b94dc27a12 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-12T07:09:32Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-faaa09f8f9ae424cb73ea9b94dc27a12; 2022-12-22T00:33:40Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; vol. 10, pp. 64761-64784; DOI 10.1109/ACCESS.2022.3183597; document 9797689; Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems; Efstratios Chatzoglou (https://orcid.org/0000-0001-6507-5052); Georgios Kambourakis (https://orcid.org/0000-0001-6348-5031); Constantinos Kolias (https://orcid.org/0000-0002-3020-291X); Christos Smiliotopoulos (https://orcid.org/0000-0001-7530-7152); Affiliations (in listed order): Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece; European Commission, Joint Research Centre (JRC), Ispra, Italy; Department of Computer Science, University of Idaho, Idaho Falls, ID, USA; Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece; https://ieeexplore.ieee.org/document/9797689/; Keywords: Intrusion detection; WiFi; 802.11; machine learning; deep learning; dataset |
title | Pick Quality Over Quantity: Expert Feature Selection and Data Preprocessing for 802.11 Intrusion Detection Systems |
title_sort | pick quality over quantity expert feature selection and data preprocessing for 802 11 intrusion detection systems |
topic | Intrusion detection; WiFi; 802.11; machine learning; deep learning; dataset |
url | https://ieeexplore.ieee.org/document/9797689/ |