PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers

Bibliographic Details
Main Authors: Meng-Han Tsai, Chia-Ching Lin, Zheng-Gang He, Wei-Chieh Yang, Chin-Laung Lei
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
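Subjects: PowerShell, de-obfuscation, machine learning, deep learning, abstract syntax trees, multi-label classification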
Online Access: https://ieeexplore.ieee.org/document/9999441/
author Meng-Han Tsai
Chia-Ching Lin
Zheng-Gang He
Wei-Chieh Yang
Chin-Laung Lei
collection DOAJ
description In recent years, PowerShell has become a common tool for attackers launching targeted attacks with living-off-the-land tactics and fileless techniques. Malware-derived PowerShell commands (PSCmds) are typically obfuscated to hide their malicious intent from detection and analysis, and their extensive use of multiple obfuscation strategies and encryption methods makes the original content difficult to recover. Despite advances in malicious PSCmd detection using machine learning and deep learning, there is still no consensus on how to de-obfuscate malicious PSCmds and profile their behavior. To address this challenge, we propose a hybrid framework that combines deep learning and program analysis for automatic PowerShell De-obfuscation and behavioral Profiling (PowerDP) through multi-label classification in a static manner. First, we use character distribution features to predict the obfuscation types of malicious PSCmds. Second, we develop an extensive de-obfuscator that uses static regular expression replacement to recover the original content of obfuscated PSCmds based on the predicted obfuscation types. Finally, we profile the behavior of PSCmds using features extracted from their abstract syntax trees after de-obfuscation. Our results show that PowerDP achieves 99.82% accuracy and 0.18% Hamming loss in obfuscation multi-label classification using deep learning. Furthermore, the de-obfuscator's average successful recovery rate against 15 obfuscation types is 98.11%, measured by semantic similarity comparison, and the behavior multi-label classification averages 98.53% accuracy in identifying 5 behaviors in malicious PSCmds. The evaluation indicates that PowerDP is able to classify and profile complicated PSCmds.
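The description reports Hamming loss alongside accuracy for the multi-label classifiers. As a minimal sketch of how these metrics are commonly computed (not the authors' evaluation code), the Python below uses hypothetical toy labels and hypothetical function names. The record does not say which accuracy definition the paper uses; the reported 99.82% and 0.18% sum to 100%, which would be consistent with label-wise accuracy (one minus Hamming loss), but that reading is an assumption.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual label slots that were predicted incorrectly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row must have the same number of labels")
        total += len(true_row)
        wrong += sum(t != p for t, p in zip(true_row, pred_row))
    if total == 0:
        raise ValueError("no labels to score")
    return wrong / total


def label_wise_accuracy(y_true: Sequence[Sequence[int]],
                        y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of label slots predicted correctly (1 - Hamming loss)."""
    return 1.0 - hamming_loss(y_true, y_pred)


def exact_match_accuracy(y_true: Sequence[Sequence[int]],
                         y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of samples whose whole label vector is predicted correctly."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Hypothetical toy data: 4 commands, 3 obfuscation-type labels each.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]

print(hamming_loss(y_true, y_pred))          # 1 wrong slot out of 12
print(label_wise_accuracy(y_true, y_pred))   # 11 correct slots out of 12
print(exact_match_accuracy(y_true, y_pred))  # 3 of 4 rows fully correct
```

Exact-match accuracy is the stricter of the two, since a single wrong label makes the whole sample count as an error.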
first_indexed 2024-04-11T01:17:20Z
format Article
id doaj.art-613997a7f3a947c7a4bcf8d2783baf96
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-11T01:17:20Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-613997a7f3a947c7a4bcf8d2783baf96; 2023-01-04T00:00:15Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; volume 11; pages 256-270; DOI 10.1109/ACCESS.2022.3232505; article number 9999441
Meng-Han Tsai (https://orcid.org/0000-0002-0636-4716), Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Chia-Ching Lin (https://orcid.org/0000-0003-2779-6486), Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Zheng-Gang He, Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Wei-Chieh Yang, Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Chin-Laung Lei (https://orcid.org/0000-0002-9011-5025), Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan
title PowerDP: De-Obfuscating and Profiling Malicious PowerShell Commands With Multi-Label Classifiers
topic PowerShell
de-obfuscation
machine learning
deep learning
abstract syntax trees
multi-label classification
url https://ieeexplore.ieee.org/document/9999441/