Detecting Reconnaissance and Discovery Tactics from the MITRE ATT&CK Framework in Zeek Conn Logs Using Spark’s Machine Learning in the Big Data Framework

While computer networks and the massive amount of communication taking place on these networks grow, the amount of damage that can be done by network intrusions grows in tandem. The need is for an effective and scalable intrusion detection system (IDS) to address these potential damages that come wi...

Bibliographic Details
Main Authors: Sikha Bagui, Dustin Mink, Subhash Bagui, Tirthankar Ghosh, Tom McElroy, Esteban Paredes, Nithisha Khasnavis, Russell Plenkers
Format: Article
Language: English
Published: MDPI AG 2022-10-01
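Series: Sensors
Subjects: Apache Spark; big data; network traffic analysis; intrusion detection systems; machine learning; Zeek Connection Logs
Online Access: https://www.mdpi.com/1424-8220/22/20/7999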
collection DOAJ
description As computer networks and the volume of communication on them grow, the damage that network intrusions can do grows in tandem. An effective and scalable intrusion detection system (IDS) is needed to address the potential damage that comes with this growth. A great deal of contemporary research on near real-time IDS focuses on applying machine learning classifiers to labeled network intrusion datasets, but these datasets need to be relevant to current network intrusions. This paper focuses on a newly created dataset, UWF-ZeekData22, that analyzes data from Zeek’s Connection Logs collected using the Security Onion 2 network security monitor and labeled using the MITRE ATT&CK framework TTPs. Due to the volume of data, Spark, in the big data framework, was used to run many of the well-known classifiers (naïve Bayes, random forest, decision tree, support vector classifier, gradient boosted trees, and logistic regression) to classify the reconnaissance and discovery tactics from this dataset. In addition to the performance of these classifiers using Spark, scalability and response time were also analyzed.
first_indexed 2024-03-09T19:30:02Z
format Article
id doaj.art-ba42a8f786d44754a7b4f5d80c3c79fe
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T19:30:02Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-ba42a8f786d44754a7b4f5d80c3c79fe; 2023-11-24T02:30:07Z; eng; MDPI AG; Sensors; 1424-8220; 2022-10-01; volume 22, issue 20, article 7999; doi 10.3390/s22207999
affiliation Sikha Bagui, Dustin Mink, Tirthankar Ghosh, Tom McElroy, Esteban Paredes, Nithisha Khasnavis, Russell Plenkers: Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA
Subhash Bagui: Department of Mathematics and Statistics, University of West Florida, Pensacola, FL 32514, USA
title Detecting Reconnaissance and Discovery Tactics from the MITRE ATT&CK Framework in Zeek Conn Logs Using Spark’s Machine Learning in the Big Data Framework
topic Apache Spark
big data
network traffic analysis
intrusion detection systems
machine learning
Zeek Connection Logs
url https://www.mdpi.com/1424-8220/22/20/7999