Models versus Datasets: Reducing Bias through Building a Comprehensive IDS Benchmark

Bibliographic Details
Main Authors: Rasheed Ahmad, Izzat Alsmadi, Wasim Alhamdani, Lo’ai Tawalbeh
Format: Article
Language: English
Published: MDPI AG 2021-12-01
Series: Future Internet
Subjects:
Online Access: https://www.mdpi.com/1999-5903/13/12/318
collection DOAJ
description Today, deep learning approaches are widely used to build Intrusion Detection Systems for securing IoT environments. However, the models’ hidden and complex nature raises various concerns, such as trusting the model output and understanding why the model made certain decisions. Researchers generally publish their proposed model’s settings and performance results based on a specific dataset and a classification model but do not report the proposed model’s output and findings. Similarly, many researchers suggest an IDS solution by focusing only on a single benchmark dataset and classifier. Such solutions are prone to generating inaccurate and biased results. This paper overcomes these limitations in previous work by analyzing various benchmark datasets and various individual and hybrid deep learning classifiers towards finding the best IDS solution for IoT that is efficient, lightweight, and comprehensive in detecting network anomalies. We also showed the model’s localized predictions and analyzed the top contributing features impacting the global performance of deep learning models. This paper aims to extract the aggregate knowledge from various datasets and classifiers and analyze the commonalities to avoid any possible bias in results and increase the trust and transparency of deep learning models. We believe this paper’s findings will help future researchers build a comprehensive IDS based on well-performing classifiers and utilize the aggregated knowledge and the minimum set of significantly contributing features.
id doaj.art-7a1c0be896364fb68fcb8fb74b9b475e
institution Directory of Open Access Journals
issn 1999-5903
doi 10.3390/fi13120318
volume 13
issue 12
article_number 318
author_affiliation Rasheed Ahmad: Department of Computer Information Sciences, University of the Cumberlands, 6178 College Station Drive, Williamsburg, KY 40769, USA
Izzat Alsmadi: Department of Computing and Cyber Security, University of Texas A&M San Antonio, One University Way, San Antonio, TX 78224, USA
Wasim Alhamdani: Department of Computer Information Sciences, University of the Cumberlands, 6178 College Station Drive, Williamsburg, KY 40769, USA
Lo’ai Tawalbeh: Department of Computing and Cyber Security, University of Texas A&M San Antonio, One University Way, San Antonio, TX 78224, USA
topic Intrusion Detection System (IDS)
deep learning
feature extraction
Internet of Things (IoT)
model interpretation