PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers

In recent years, PowerShell has become a common tool for attackers launching targeted attacks with living-off-the-land tactics and fileless attack techniques. Unfortunately, malware-derived PowerShell commands (PSCmds) are typically obfuscated to hide their malicious intent from detection...


Bibliographic Details
Main Authors:	Meng-Han Tsai, Chia-Ching Lin, Zheng-Gang He, Wei-Chieh Yang, Chin-Laung Lei
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	PowerShell; de-obfuscation; machine learning; deep learning; abstract syntax trees; multi-label classification
Online Access:	https://ieeexplore.ieee.org/document/9999441/

_version_	1797962631175208960
author	Meng-Han Tsai, Chia-Ching Lin, Zheng-Gang He, Wei-Chieh Yang, Chin-Laung Lei
author_facet	Meng-Han Tsai, Chia-Ching Lin, Zheng-Gang He, Wei-Chieh Yang, Chin-Laung Lei
author_sort	Meng-Han Tsai
collection	DOAJ
description	In recent years, PowerShell has become a common tool for attackers launching targeted attacks with living-off-the-land tactics and fileless attack techniques. Unfortunately, malware-derived PowerShell commands (PSCmds) are typically obfuscated to hide their malicious intent from detection and analysis. Moreover, malicious PSCmds often combine multiple obfuscation strategies and encryption methods, which makes their original content difficult to recover. Despite advances in malicious PSCmd detection that incorporate approaches such as machine learning and deep learning, there is still no consensus on how to de-obfuscate malicious PSCmds and profile their behavior. To address this challenge, we propose a hybrid static framework that combines deep learning and program analysis for automatic PowerShell De-obfuscation and behavioral Profiling (PowerDP) through multi-label classification. First, we use character distribution features to predict the obfuscation types of malicious PSCmds. Second, we develop an extensive de-obfuscator that uses static regular expression replacement to recover the original content of obfuscated PSCmds based on the predicted obfuscation types. Finally, we profile the behavior of PSCmds using features extracted from their abstract syntax trees after de-obfuscation. Our results show that PowerDP achieves 99.82% accuracy and 0.18% Hamming loss in obfuscation multi-label classification using deep learning. Furthermore, the de-obfuscator’s average successful recovery rate against 15 obfuscation types is 98.11%, measured by semantic similarity comparison, and the behavior multi-label classifier identifies 5 behaviors in malicious PSCmds with an average accuracy of 98.53%. The evaluation indicates that PowerDP is able to classify and profile complicated PSCmds.
first_indexed	2024-04-11T01:17:20Z
format	Article
id	doaj.art-613997a7f3a947c7a4bcf8d2783baf96
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-04-11T01:17:20Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-613997a7f3a947c7a4bcf8d2783baf96; 2023-01-04T00:00:15Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; vol. 11, pp. 256-270; 10.1109/ACCESS.2022.3232505; 9999441; PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers; Meng-Han Tsai (https://orcid.org/0000-0002-0636-4716), Chia-Ching Lin (https://orcid.org/0000-0003-2779-6486), Zheng-Gang He, Wei-Chieh Yang, Chin-Laung Lei (https://orcid.org/0000-0002-9011-5025); all authors: Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan; https://ieeexplore.ieee.org/document/9999441/; PowerShell, de-obfuscation, machine learning, deep learning, abstract syntax trees, multi-label classification
title	PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers
topic	PowerShell; de-obfuscation; machine learning; deep learning; abstract syntax trees; multi-label classification
url	https://ieeexplore.ieee.org/document/9999441/
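
Illustrative note on the description field: the abstract mentions character distribution features and reports accuracy alongside Hamming loss for multi-label classification. The Python sketch below shows one plausible reading of both ideas. It is not taken from the paper: the feature alphabet (printable ASCII), the function names, and the label vectors are all assumptions made for illustration. The reported 99.82% accuracy and 0.18% Hamming loss sum to 100%, which is consistent with a label-wise accuracy of 1 minus the Hamming loss, although this record does not state how accuracy was defined.

```python
from collections import Counter
import string

# Assumption: the feature alphabet is printable ASCII (100 characters).
# The record does not say which characters the authors actually use.
ALPHABET = string.printable


def char_distribution(command: str) -> list[float]:
    """Return the relative frequency of each ALPHABET character in command."""
    counts = Counter(command)
    total = sum(counts[c] for c in ALPHABET)
    if total == 0:
        return [0.0] * len(ALPHABET)
    return [counts[c] / total for c in ALPHABET]


def hamming_loss(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = 0
    slots = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            wrong += int(t != p)
            slots += 1
    return wrong / slots


def labelwise_accuracy(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    """Label-wise accuracy, which by definition equals 1 - Hamming loss."""
    return 1.0 - hamming_loss(y_true, y_pred)


# Hypothetical example: two commands, three made-up obfuscation-type labels.
features = char_distribution("I`E`X ('a'+'b')")
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
loss = hamming_loss(y_true, y_pred)
accuracy = labelwise_accuracy(y_true, y_pred)
```

In this toy example one of six label slots is wrong, so the Hamming loss is 1/6 and the label-wise accuracy is 5/6. A stricter subset (exact-match) accuracy would instead count the first command as entirely wrong, which is why the two metrics need to be distinguished when comparing multi-label results.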


Similar Items