An Insight into the Machine-Learning-Based Fileless Malware Detection

In recent years, massive development in the malware industry changed the entire landscape for malware development. Therefore, cybercriminals became more sophisticated by advancing their development techniques from file-based to fileless malware. As file-based malware depends on files to spread itsel...

Bibliographic Details
Main Authors: Osama Khalid, Subhan Ullah, Tahir Ahmad, Saqib Saeed, Dina A. Alabbad, Mudassar Aslam, Attaullah Buriro, Rizwan Ahmad
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Sensors
Subjects: malware, fileless malware, volatility, cybercrimes, machine learning, memory forensics
Online Access: https://www.mdpi.com/1424-8220/23/2/612
_version_ 1827622204696690688
author Osama Khalid
Subhan Ullah
Tahir Ahmad
Saqib Saeed
Dina A. Alabbad
Mudassar Aslam
Attaullah Buriro
Rizwan Ahmad
author_sort Osama Khalid
collection DOAJ
description In recent years, massive development in the malware industry has changed the entire landscape of malware development. Cybercriminals have become more sophisticated, advancing their development techniques from file-based to fileless malware. Whereas file-based malware depends on files to spread itself, fileless malware does not require a traditional file system and uses benign processes to carry out its malicious intent. It therefore evades conventional detection techniques and remains stealthy. This paper briefly explains fileless malware, its life cycle, and its infection chain. Moreover, it proposes a detection technique based on feature analysis using machine learning for fileless malware detection. The virtual machine acquired the memory dumps upon executing the malicious and non-malicious samples. The necessary features are then extracted using the Volatility memory forensics tool and analyzed using machine learning classification algorithms. After that, the best algorithm is selected based on the k-fold cross-validation score. Experimental evaluation has shown that Random Forest outperforms the other machine learning classifiers (Decision Tree, Support Vector Machine, Logistic Regression, K-Nearest Neighbor, XGBoost, and Gradient Boosting). It achieved an overall accuracy of 93.33% with a True Positive Rate (TPR) of 87.5% at zero False Positive Rate (FPR) for fileless malware collected from five widely used datasets (VirusShare, AnyRun, PolySwarm, HatchingTriage, and Joe Sandbox).
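The three figures reported in the description (accuracy, TPR, FPR) are all derived from a binary confusion matrix. The sketch below shows that arithmetic in Python. The counts are hypothetical: they were chosen only because they reproduce the reported percentages, and the record itself does not state the size or composition of the test set. The class name `ConfusionCounts` is likewise an illustration, not something taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'malicious' as the positive class."""

    tp: int  # malicious samples flagged as malicious
    fn: int  # malicious samples that were missed
    tn: int  # benign samples passed as benign
    fp: int  # benign samples wrongly flagged as malicious

    def accuracy(self) -> float:
        # Share of all samples that were classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total

    def tpr(self) -> float:
        # True Positive Rate: share of malicious samples that were detected.
        return self.tp / (self.tp + self.fn)

    def fpr(self) -> float:
        # False Positive Rate: share of benign samples that were flagged.
        return self.fp / (self.fp + self.tn)


# Hypothetical counts (an assumption, not data from the article): 8 malicious
# and 7 benign test samples, with one malicious sample missed.
counts = ConfusionCounts(tp=7, fn=1, tn=7, fp=0)

print(f"accuracy = {counts.accuracy():.2%}")
print(f"TPR      = {counts.tpr():.2%}")
print(f"FPR      = {counts.fpr():.2%}")
```

With these assumed counts, accuracy is 14/15, TPR is 7/8, and FPR is 0/7, which matches the reported 93.33%, 87.5%, and zero. Other test-set sizes with the same ratios would give the same percentages.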
first_indexed 2024-03-09T11:18:15Z
format Article
id doaj.art-72e3a887af1045cc9013ba83076e1fea
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T11:18:15Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-72e3a887af1045cc9013ba83076e1fea
2023-12-01T00:24:37Z
eng
MDPI AG
Sensors
1424-8220
2023-01-01
23(2), 612
10.3390/s23020612
An Insight into the Machine-Learning-Based Fileless Malware Detection
Osama Khalid (FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Islamabad 44000, Pakistan)
Subhan Ullah (FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Islamabad 44000, Pakistan)
Tahir Ahmad (Center for Cybersecurity, Bruno Kessler Foundation, 38123 Trento, Italy)
Saqib Saeed (SAUDI ARAMCO Cybersecurity Chair, Department of Computer Information Systems, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia)
Dina A. Alabbad (SAUDI ARAMCO Cybersecurity Chair, Department of Computer Engineering, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia)
Mudassar Aslam (FAST School of Computing, National University of Computer and Emerging Sciences (NUCES-FAST), Islamabad 44000, Pakistan)
Attaullah Buriro (Faculty of Computer Science, Free University Bozen-Bolzano, 39100 Bolzano, Italy)
Rizwan Ahmad (School of Electrical Engineering and Computer Science, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan)
https://www.mdpi.com/1424-8220/23/2/612
malware
fileless malware
volatility
cybercrimes
machine learning
memory forensics
title An Insight into the Machine-Learning-Based Fileless Malware Detection
topic malware
fileless malware
volatility
cybercrimes
machine learning
memory forensics
url https://www.mdpi.com/1424-8220/23/2/612