A Malware Detection Framework Based on Semantic Information of Behavioral Features

As the amount of malware has grown rapidly in recent years, it has become the most dominant attack method in network security. Learning execution behavior, especially Application Programming Interface (API) call sequences, has been shown to be effective for malware detection. However, it is troubles...


Bibliographic details
Authors:	Yuxin Zhang, Shumian Yang, Lijuan Xu, Xin Li, Dawei Zhao
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	Applied Sciences
Subjects:	network security; dynamic analysis; API sequences; deep learning; malware detection
Online access:	https://www.mdpi.com/2076-3417/13/22/12528

_version_	1827640661793308672
author	Yuxin Zhang Shumian Yang Lijuan Xu Xin Li Dawei Zhao
collection	DOAJ
description	As the amount of malware has grown rapidly in recent years, it has become the dominant attack method in network security. Learning execution behavior, especially Application Programming Interface (API) call sequences, has been shown to be effective for malware detection. In practice, however, it is difficult to mine API call features adequately. Most current methods analyze only a single feature, or analyze the features inadequately and ignore structural and semantic features, which causes information loss and reduces accuracy. To address these problems, we propose a novel malware detection method based on semantic information of behavioral features. First, we preprocess the sequence of API function calls to reduce redundant information. Then, we obtain a vectorized representation of the API call sequence with a word embedding model, and encode each API call name by analyzing it to characterize the API name’s semantic structure information and statistical information. Finally, a malware detector consisting of a CNN and a bidirectional GRU, which can better capture the local and global features between API calls, is used for detection. We evaluate the proposed model on a publicly available dataset provided by a third party. The experimental results show that the proposed method outperforms the baseline method. With this combined neural network architecture, the proposed model attains a detection accuracy of 0.9828 and an F1-score of 0.9827.
first_indexed	2024-03-09T17:02:45Z
format	Article
id	doaj.art-a87c6078c23a4a638f97ce3bd5fcd139
institution	Directory Open Access Journal
issn	2076-3417
language	English
last_indexed	2024-03-09T17:02:45Z
publishDate	2023-11-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-a87c6078c23a4a638f97ce3bd5fcd139; 2023-11-24T14:28:18Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-11-01; vol. 13, no. 22, art. 12528; doi:10.3390/app132212528. Affiliation (all five authors): Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China.
title	A Malware Detection Framework Based on Semantic Information of Behavioral Features
topic	network security dynamic analysis API sequences deep learning malware detection
url	https://www.mdpi.com/2076-3417/13/22/12528
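
The description field above outlines a pipeline whose first two steps are reducing redundancy in the API call sequence and analyzing each API call name for its semantic structure. The record does not give the authors' exact rules, so the following Python sketch is only an illustration under stated assumptions: redundancy reduction is assumed to mean collapsing runs of consecutive identical calls, and name analysis is assumed to mean splitting a CamelCase API name into word parts. The function names `collapse_repeats` and `split_api_name`, and the parameter `max_run`, are hypothetical and do not come from the paper.

```python
import re
from itertools import groupby

def collapse_repeats(calls, max_run=1):
    """Keep at most max_run consecutive copies of each API call.

    Assumption: this stands in for the 'reduce redundant information'
    step; the paper's actual preprocessing may differ.
    """
    out = []
    for name, run in groupby(calls):
        out.extend([name] * min(max_run, len(list(run))))
    return out

# Matches, in order of preference: an acronym (capitals not followed by a
# lowercase letter), a capitalized or lowercase word, or a run of digits.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")

def split_api_name(name):
    """Split an API name such as 'NtCreateFile' into lowercase word parts.

    Assumption: word parts are one plausible way to expose the semantic
    structure of a name before embedding it.
    """
    return [part.lower() for part in _WORD.findall(name)]

if __name__ == "__main__":
    trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtReadFile", "NtClose"]
    for call in collapse_repeats(trace):
        print(call, split_api_name(call))
```

The resulting word parts could then be fed to a word embedding model, and the embedded sequence to a classifier such as the CNN and bidirectional GRU combination the description mentions; those stages are omitted here because the record gives no architectural details.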
