Impact of Code Deobfuscation and Feature Interaction in Android Malware Detection
Main Authors: | Yun-Chung Chen, Hong-Yen Chen, Takeshi Takahashi, Bo Sun, Tsung-Nan Lin |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
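Subjects: | Android malware detection; classification; code deobfuscation; feature interaction; machine learning; static analysis |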
Online Access: | https://ieeexplore.ieee.org/document/9529344/ |
_version_ | 1818647710764367872 |
---|---|
author | Yun-Chung Chen, Hong-Yen Chen, Takeshi Takahashi, Bo Sun, Tsung-Nan Lin |
author_facet | Yun-Chung Chen, Hong-Yen Chen, Takeshi Takahashi, Bo Sun, Tsung-Nan Lin |
author_sort | Yun-Chung Chen |
collection | DOAJ |
description | With more than three million applications already in the Android marketplace, various malware detection systems based on machine learning have been proposed to prevent attacks from cybercriminals; most of these systems use static analyses to extract application features. However, many features generated by static analyses can be easily thwarted by obfuscation techniques. Therefore, several researchers have addressed this obfuscation problem with obfuscation-invariant features. However, to the best of our knowledge, no researcher has utilized deobfuscation techniques. To this end, we adopt a code deobfuscation technique with an Android malware detection system and investigate its effects. Experimental results indicate that code deobfuscation can successfully retrieve useful information concealed by obfuscation. Further, we propose interaction terms based on identified feature interactions. The proposed interaction terms aim to eliminate the interference caused by the size of the application and other features because many feature values are correlated to the size of the application. In addition, the experimental results indicate that these interaction terms have a high ranking in terms of feature importance values. Our proposed Android malware detection model achieves 99.55% accuracy and a 94.61% F1-score with the well-known Drebin dataset, which is better than the performance of previous works. |
first_indexed | 2024-12-17T01:06:52Z |
format | Article |
id | doaj.art-5badca29484f45d6a833f0755e01f54f |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-17T01:06:52Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | Record ID: doaj.art-5badca29484f45d6a833f0755e01f54f; Record timestamp: 2022-12-21T22:09:14Z; Language: eng; Publisher: IEEE; Series: IEEE Access; ISSN: 2169-3536; Published: 2021-01-01; Volume: 9; Pages: 123208-123219; DOI: 10.1109/ACCESS.2021.3110408; Article number: 9529344; Title: Impact of Code Deobfuscation and Feature Interaction in Android Malware Detection; Authors: Yun-Chung Chen (https://orcid.org/0000-0003-4207-5695; Graduate Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan), Hong-Yen Chen (https://orcid.org/0000-0001-5638-8030; Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan), Takeshi Takahashi (https://orcid.org/0000-0002-6477-7770; National Institute of Information and Communications Technology, Koganei, Tokyo, Japan), Bo Sun (https://orcid.org/0000-0002-7822-3672; National Institute of Information and Communications Technology, Koganei, Tokyo, Japan), Tsung-Nan Lin (https://orcid.org/0000-0001-5659-1194; Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan); URL: https://ieeexplore.ieee.org/document/9529344/; Keywords: Android malware detection; classification; code deobfuscation; feature interaction; machine learning; static analysis |
title | Impact of Code Deobfuscation and Feature Interaction in Android Malware Detection |
topic | Android malware detection; classification; code deobfuscation; feature interaction; machine learning; static analysis |
url | https://ieeexplore.ieee.org/document/9529344/ |