A Universal Malicious Documents Static Detection Framework Based on Feature Generalization
In this study, Portable Document Format (PDF), Word, Excel, Rich Text Format (RTF) and image documents are taken as the research objects to study a static and fast method by which to detect malicious documents. Malicious PDF and Word document features are abstracted and extended, which can be used t...
Main Authors: | Xiaofeng Lu, Fei Wang, Cheng Jiang, Pietro Lio |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-12-01 |
Series: | Applied Sciences |
Subjects: | malicious document detection; static detection; feature generalization; machine learning |
Online Access: | https://www.mdpi.com/2076-3417/11/24/12134 |
_version_ | 1797506697821945856 |
---|---|
author | Xiaofeng Lu, Fei Wang, Cheng Jiang, Pietro Lio |
author_sort | Xiaofeng Lu |
collection | DOAJ |
description | In this study, Portable Document Format (PDF), Word, Excel, Rich Text Format (RTF) and image documents are taken as the research objects to study a static and fast method by which to detect malicious documents. Malicious PDF and Word document features are abstracted and extended, which can be used to detect other types of documents. A universal static detection framework for malicious documents based on feature generalization is then proposed. The generalized features include specification check errors, the structure path, code keywords, and the number of objects. The proposed method is verified on two datasets, and is compared with Kaspersky, NOD32, and McAfee antivirus software. The experimental results demonstrate that the proposed method achieves good performance in terms of the detection accuracy, runtime, and scalability. The average F1-score of all types of documents is found to be 0.99, and the average detection time of a document is 0.5926 s, which is at the same level as the compared antivirus software. |
first_indexed | 2024-03-10T04:37:10Z |
format | Article |
id | doaj.art-f55ead5279534ab88f4085a0ce4ccaac |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T04:37:10Z |
publishDate | 2021-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-f55ead5279534ab88f4085a0ce4ccaac; 2023-11-23T03:43:43Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2021-12-01; vol. 11, no. 24, art. 12134; DOI 10.3390/app112412134; A Universal Malicious Documents Static Detection Framework Based on Feature Generalization; Xiaofeng Lu (School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China); Fei Wang (School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China); Cheng Jiang (School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China); Pietro Lio (Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK); https://www.mdpi.com/2076-3417/11/24/12134; malicious document detection; static detection; feature generalization; machine learning |
title | A Universal Malicious Documents Static Detection Framework Based on Feature Generalization |
topic | malicious document detection; static detection; feature generalization; machine learning |
url | https://www.mdpi.com/2076-3417/11/24/12134 |