Lightweight and Robust Malware Detection Using Dictionaries of API Calls

Bibliographic Details
Main Authors: Ammar Yahya Daeef, Ali Al-Naji, Javaan Chahl
Format: Article
Language: English
Published: MDPI AG 2023-11-01
Series: Telecom
Subjects: API call sequence, statistical classifier, model drift, malware detection
Online Access: https://www.mdpi.com/2673-4001/4/4/34
author Ammar Yahya Daeef
Ali Al-Naji
Javaan Chahl
collection DOAJ
description Malware in today’s business world has become a powerful tool used by cyber attackers. It has become more advanced, spreading quickly and causing significant harm. Modern malware is particularly dangerous because it can go undetected, making it difficult to investigate and stop in real time. For businesses, it is vital to ensure that the computer systems are free from malware. To effectively address this problem, the most responsive solution is to operate in real time at the system’s edge. Although machine learning and deep learning have given promising performance for malware detection, the significant challenge is the required processing power and resources for implementation at the system’s edge. Therefore, it is important to prioritize a lightweight approach at the system’s edge. Equally important, the robustness of the model against the concept drift at the system’s edge is crucial to detecting the evolved zero-day malware attacks. Application programming interface (API) calls emerge as the most promising candidate to provide such a solution. However, it is quite challenging to create API call features to achieve a lightweight implementation, high malware detection rate, robustness, and fast execution. This study seeks to investigate and analyze the reuse rate of API calls in both malware and goodware, shedding light on the limitations of API call dictionaries for each class using different datasets. By leveraging these dictionaries, a statistical classifier (STC) is introduced to detect malware samples. Furthermore, the study delves into the investigation of model drift in the STC model, employing entirely distinct datasets for training and testing purposes. The results show the outstanding performance of the STC model in accurately detecting malware, achieving a recall value of one, and exhibiting robustness against model drift. 
Furthermore, the proposed STC model shows comparable performance to deep learning algorithms, which makes it a strong competitor for performing real-time inference on edge devices.
first_indexed 2024-03-08T20:18:45Z
format Article
id doaj.art-0e6be2d571794c94844d62dd1cf99333
institution Directory Open Access Journal
issn 2673-4001
language English
last_indexed 2024-03-08T20:18:45Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Telecom
spelling doaj.art-0e6be2d571794c94844d62dd1cf99333
2023-12-22T14:45:42Z
eng
MDPI AG
Telecom
2673-4001
2023-11-01
vol. 4, no. 4, pp. 746-757
10.3390/telecom4040034
Lightweight and Robust Malware Detection Using Dictionaries of API Calls
Ammar Yahya Daeef (Technical Institute for Administration, Middle Technical University, Baghdad 10074, Iraq)
Ali Al-Naji (Electrical Engineering Technical College, Middle Technical University, Baghdad 10022, Iraq)
Javaan Chahl (School of Engineering, University of South Australia, Mawson Lakes, SA 5095, Australia)
https://www.mdpi.com/2673-4001/4/4/34
API call sequence
statistical classifier
model drift
malware detection
title Lightweight and Robust Malware Detection Using Dictionaries of API Calls
topic API call sequence
statistical classifier
model drift
malware detection
url https://www.mdpi.com/2673-4001/4/4/34