A Feasibility Study on Evasion Attacks Against NLP-Based Macro Malware Detection Algorithms

Machine learning-based malware detection models have gained prominence as a means of detecting obfuscated malware. These models extract malicious features and classify samples as either malware or benign. Conversely, benign features can be employed to imitate benign samples...

Bibliographic Details
Main Authors: Mamoru Mimura, Risa Yamamoto
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Macro malware; machine learning; evasion attack; LSA; paragraph vector
Online Access: https://ieeexplore.ieee.org/document/10345584/
author Mamoru Mimura
Risa Yamamoto
collection DOAJ
description Machine learning-based malware detection models have gained prominence as a means of detecting obfuscated malware. These models extract malicious features and classify samples as either malware or benign. Conversely, benign features can be inserted into malware to make it imitate benign samples. For Android applications, numerous researchers have assessed this threat and addressed the problem. The same evasion technique can be extended to other malicious scripts, such as macro malware. In this paper, we investigate the feasibility of evasion attacks against natural language processing (NLP)-based macro malware detection algorithms. We evaluate three language models as feature extraction methods: Bag of Words, Latent Semantic Analysis (LSA), and Paragraph Vector. Our experimental results show that the detection rate drops to 2 percent when benign features are inserted into actual macro malware. The attack is effective even against advanced language models.
first_indexed 2024-03-08T19:37:35Z
format Article
id doaj.art-b3e8c39be6b94290a953b72b60fe1836
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T19:37:35Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-b3e8c39be6b94290a953b72b60fe1836
2023-12-26T00:08:03Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
11
138336
138346
10.1109/ACCESS.2023.3339827
10345584
A Feasibility Study on Evasion Attacks Against NLP-Based Macro Malware Detection Algorithms
Mamoru Mimura (0) https://orcid.org/0000-0003-4323-9911
Risa Yamamoto (1)
(0) National Defense Academy of Japan, Yokosuka, Japan
(1) Japan Ground Self-Defense Force, Shinjuku-ku, Japan
https://ieeexplore.ieee.org/document/10345584/
Macro malware
machine learning
evasion attack
LSA
paragraph vector
title A Feasibility Study on Evasion Attacks Against NLP-Based Macro Malware Detection Algorithms
title_sort feasibility study on evasion attacks against nlp based macro malware detection algorithms
topic Macro malware
machine learning
evasion attack
LSA
paragraph vector
url https://ieeexplore.ieee.org/document/10345584/