PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers
In recent years, PowerShell has become a common tool for attackers launching targeted attacks with living-off-the-land tactics and fileless techniques. Malware-derived PowerShell commands (PSCmds) are typically obfuscated to hide their malicious intent from detection...
Main Authors: | Meng-Han Tsai, Chia-Ching Lin, Zheng-Gang He, Wei-Chieh Yang, Chin-Laung Lei |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
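Subjects: | PowerShell; de-obfuscation; machine learning; deep learning; abstract syntax trees; multi-label classification |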
Online Access: | https://ieeexplore.ieee.org/document/9999441/ |
_version_ | 1797962631175208960 |
---|---|
author | Meng-Han Tsai; Chia-Ching Lin; Zheng-Gang He; Wei-Chieh Yang; Chin-Laung Lei |
collection | DOAJ |
description | In recent years, PowerShell has become a common tool for attackers launching targeted attacks with living-off-the-land tactics and fileless techniques. Malware-derived PowerShell commands (PSCmds) are typically obfuscated to hide their malicious intent from detection and analysis, and their extensive use of multiple obfuscation strategies and encryption methods makes that intent difficult to reveal. Despite advances in malicious PSCmd detection that incorporate machine learning and deep learning, there is still no consensus on how to de-obfuscate malicious PSCmds and profile their behavior. To address this challenge, we propose PowerDP, a hybrid framework that combines deep learning and program analysis for automatic PowerShell De-obfuscation and behavioral Profiling through static multi-label classification. First, we use character distribution features to predict the obfuscation types of malicious PSCmds. Second, we develop an extensive de-obfuscator that uses static regular expression replacement to recover the original content of obfuscated PSCmds based on the predicted obfuscation types. Finally, we profile the behavior of PSCmds using features extracted from their abstract syntax trees after de-obfuscation. Our results show that PowerDP achieves 99.82% accuracy and 0.18% Hamming loss in obfuscation multi-label classification using deep learning. Furthermore, the de-obfuscator's successful recovery rate against 15 obfuscation types averages 98.11% under semantic similarity comparison, and the accuracy of the behavior multi-label classification for identifying 5 behaviors in malicious PSCmds averages 98.53%. The evaluation indicates that PowerDP can classify and profile complicated PSCmds. |
first_indexed | 2024-04-11T01:17:20Z |
format | Article |
id | doaj.art-613997a7f3a947c7a4bcf8d2783baf96 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T01:17:20Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
doi | 10.1109/ACCESS.2022.3232505 |
volume | 11 |
pages | 256-270 |
orcid | Meng-Han Tsai: https://orcid.org/0000-0002-0636-4716; Chia-Ching Lin: https://orcid.org/0000-0003-2779-6486; Chin-Laung Lei: https://orcid.org/0000-0002-9011-5025 |
affiliation | Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan (all five authors) |
title | PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers |
topic | PowerShell; de-obfuscation; machine learning; deep learning; abstract syntax trees; multi-label classification |
url | https://ieeexplore.ieee.org/document/9999441/ |
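
The description names three technical ingredients: character distribution features, static regular expression replacement, and Hamming loss as an evaluation metric. The Python sketch below illustrates each idea in miniature and is not the authors' implementation. The feature alphabet, the two regular expressions, and the function names `char_distribution`, `deobfuscate`, and `hamming_loss` are all assumptions made for illustration. The record does not list the 15 obfuscation types, so the sketch handles only two well-known ones (inserted backticks and concatenation of single-quoted strings).

```python
import re
import string
from collections import Counter

# Assumed feature alphabet: the 100 printable ASCII characters.
ALPHABET = string.printable


def char_distribution(command):
    """Relative frequency of each ALPHABET character in the command."""
    counts = Counter(command)
    total = sum(counts[ch] for ch in ALPHABET)
    if total == 0:
        return [0.0] * len(ALPHABET)
    return [counts[ch] / total for ch in ALPHABET]


# A backtick followed by a character that is not one of PowerShell's
# escape letters (`0 `a `b `e `f `n `r `t `u `v) is treated as padding.
# Simplification: backticks inside quoted strings are not special-cased.
TICK = re.compile(r"`([^0abefnrtuv])")
# Two single-quoted literals joined by '+'.
CONCAT = re.compile(r"'([^']*)'\s*\+\s*'([^']*)'")


def deobfuscate(command):
    """Strip padding backticks, then merge concatenated string literals."""
    command = TICK.sub(r"\1", command)
    previous = None
    while previous != command:  # repeat until no literal pair remains
        previous = command
        command = CONCAT.sub(r"'\1\2'", command)
    return command


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots predicted incorrectly."""
    slots = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / slots


print(deobfuscate("I`E`X ('Down'+'load'+'String')"))  # IEX ('DownloadString')
```

The reported 99.82% accuracy and 0.18% Hamming loss sum to 100%, which is consistent with accuracy being measured per label slot (one minus the Hamming loss), although the record itself does not state how accuracy was computed.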