A Malware Detection Framework Based on Semantic Information of Behavioral Features
As the amount of malware has grown rapidly in recent years, it has become the most dominant attack method in network security. Learning execution behavior, especially Application Programming Interface (API) call sequences, has been shown to be effective for malware detection. However, it is troubles...
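Authors: | Yuxin Zhang, Shumian Yang, Lijuan Xu, Xin Li, Dawei Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Applied Sciences |
Subjects: | network security, dynamic analysis, API sequences, deep learning, malware detection |
Online access: | https://www.mdpi.com/2076-3417/13/22/12528 |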
_version_ | 1827640661793308672 |
---|---|
author | Yuxin Zhang, Shumian Yang, Lijuan Xu, Xin Li, Dawei Zhao |
author_sort | Yuxin Zhang |
collection | DOAJ |
description | As the amount of malware has grown rapidly in recent years, it has become the most dominant attack method in network security. Learning execution behavior, especially Application Programming Interface (API) call sequences, has been shown to be effective for malware detection. However, adequately mining API call features is difficult in practice. Most current methods analyze only a single feature or analyze the features inadequately, ignoring structural and semantic features, which causes information loss and reduces accuracy. To address these problems, we propose a novel malware detection method based on semantic information of behavioral features. First, we preprocess the sequence of API function calls to reduce redundant information. Then, we obtain a vectorized representation of the API call sequence using a word embedding model, and encode each API call name by analyzing it to characterize the name's semantic structure information and statistical information. Finally, a malware detector consisting of a CNN and a bidirectional GRU, which can better capture the local and global features between API calls, is used for detection. We evaluate the proposed model on a publicly available dataset provided by a third party. The experimental results show that the proposed method outperforms the baseline method. With this combined neural network architecture, our proposed model attains a detection accuracy of 0.9828 and an F1-score of 0.9827. |
first_indexed | 2024-03-09T17:02:45Z |
format | Article |
id | doaj.art-a87c6078c23a4a638f97ce3bd5fcd139 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T17:02:45Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-a87c6078c23a4a638f97ce3bd5fcd139; 2023-11-24T14:28:18Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-11-01; vol. 13, no. 22, article 12528; DOI 10.3390/app132212528; A Malware Detection Framework Based on Semantic Information of Behavioral Features; Yuxin Zhang, Shumian Yang, Lijuan Xu, Xin Li, Dawei Zhao (all authors: Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China); https://www.mdpi.com/2076-3417/13/22/12528; network security, dynamic analysis, API sequences, deep learning, malware detection |
title | A Malware Detection Framework Based on Semantic Information of Behavioral Features |
topic | network security, dynamic analysis, API sequences, deep learning, malware detection |
url | https://www.mdpi.com/2076-3417/13/22/12528 |
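
The description above says the API call sequence is preprocessed "to reduce redundant information" but does not say how. One common way to do this, shown here purely as an illustrative sketch and not as the authors' procedure, is to collapse runs of identical consecutive calls. The function name `collapse_repeats`, the `max_run` parameter, and the sample trace are all hypothetical.

```python
from itertools import groupby
from typing import Iterable, List


def collapse_repeats(calls: Iterable[str], max_run: int = 1) -> List[str]:
    """Keep at most `max_run` copies of each run of identical consecutive calls.

    Assumption: "redundant information" means back-to-back repeats of the
    same API name. The record does not state the actual preprocessing rule.
    """
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    collapsed: List[str] = []
    # groupby yields one (name, run) pair per maximal run of equal neighbours.
    for name, run in groupby(calls):
        run_length = sum(1 for _ in run)
        collapsed.extend([name] * min(run_length, max_run))
    return collapsed


# Hypothetical trace, used only to exercise the function.
trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtReadFile", "NtClose", "NtClose"]
print(collapse_repeats(trace))  # ['NtOpenFile', 'NtReadFile', 'NtClose']
```

Raising `max_run` above 1 keeps some evidence that a call was repeated while still shortening long loops, which may matter if repetition itself is a useful signal for the downstream embedding.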