Lightweight and Robust Malware Detection Using Dictionaries of API Calls

Malware in today’s business world has become a powerful tool used by cyber attackers. It has become more advanced, spreading quickly and causing significant harm. Modern malware is particularly dangerous because it can go undetected, making it difficult to investigate and stop in real time. For busi...

Bibliographic Details
Main Authors:	Ammar Yahya Daeef, Ali Al-Naji, Javaan Chahl
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	Telecom
Subjects:	API call sequence, statistical classifier, model drift, malware detection
Online Access:	https://www.mdpi.com/2673-4001/4/4/34

_version_	1797379202937257984
author	Ammar Yahya Daeef, Ali Al-Naji, Javaan Chahl
author_facet	Ammar Yahya Daeef, Ali Al-Naji, Javaan Chahl
author_sort	Ammar Yahya Daeef
collection	DOAJ
description	Malware in today’s business world has become a powerful tool used by cyber attackers. It has become more advanced, spreading quickly and causing significant harm. Modern malware is particularly dangerous because it can go undetected, making it difficult to investigate and stop in real time. For businesses, it is vital to ensure that the computer systems are free from malware. To effectively address this problem, the most responsive solution is to operate in real time at the system’s edge. Although machine learning and deep learning have given promising performance for malware detection, the significant challenge is the required processing power and resources for implementation at the system’s edge. Therefore, it is important to prioritize a lightweight approach at the system’s edge. Equally important, the robustness of the model against the concept drift at the system’s edge is crucial to detecting the evolved zero-day malware attacks. Application programming interface (API) calls emerge as the most promising candidate to provide such a solution. However, it is quite challenging to create API call features to achieve a lightweight implementation, high malware detection rate, robustness, and fast execution. This study seeks to investigate and analyze the reuse rate of API calls in both malware and goodware, shedding light on the limitations of API call dictionaries for each class using different datasets. By leveraging these dictionaries, a statistical classifier (STC) is introduced to detect malware samples. Furthermore, the study delves into the investigation of model drift in the STC model, employing entirely distinct datasets for training and testing purposes. The results show the outstanding performance of the STC model in accurately detecting malware, achieving a recall value of one, and exhibiting robustness against model drift. Furthermore, the proposed STC model shows comparable performance to deep learning algorithms, which makes it a strong competitor for performing real-time inference on edge devices.
first_indexed	2024-03-08T20:18:45Z
format	Article
id	doaj.art-0e6be2d571794c94844d62dd1cf99333
institution	Directory of Open Access Journals
issn	2673-4001
language	English
last_indexed	2024-03-08T20:18:45Z
publishDate	2023-11-01
publisher	MDPI AG
record_format	Article
series	Telecom
spelling	doaj.art-0e6be2d571794c94844d62dd1cf99333 | 2023-12-22T14:45:42Z | eng | MDPI AG | Telecom | 2673-4001 | 2023-11-01 | 4 | 4 | 746 | 757 | 10.3390/telecom4040034 | Lightweight and Robust Malware Detection Using Dictionaries of API Calls | Ammar Yahya Daeef [0]; Ali Al-Naji [1]; Javaan Chahl [2] | [0] Technical Institute for Administration, Middle Technical University, Baghdad 10074, Iraq | [1] Electrical Engineering Technical College, Middle Technical University, Baghdad 10022, Iraq | [2] School of Engineering, University of South Australia, Mawson Lakes, SA 5095, Australia | https://www.mdpi.com/2673-4001/4/4/34 | API call sequence; statistical classifier; model drift; malware detection
title	Lightweight and Robust Malware Detection Using Dictionaries of API Calls
topic	API call sequence, statistical classifier, model drift, malware detection
url	https://www.mdpi.com/2673-4001/4/4/34

Similar Items