Explainable Machine Learning for Malware Detection on Android Applications
The presence of malicious software (malware), for example, in Android applications (apps), has harmful or irreparable consequences to the user and/or the device. Despite the protections app stores provide to avoid malware, it keeps growing in sophistication and diffusion. In this paper, we explore t...
Main Authors: | Catarina Palma, Artur Ferreira, Mário Figueiredo |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-01-01 |
Series: | Information |
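Subjects: | android applications, datasets, explainability, feature selection, machine learning, malware detection |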
Online Access: | https://www.mdpi.com/2078-2489/15/1/25 |
_version_ | 1797343391721193472 |
---|---|
author | Catarina Palma, Artur Ferreira, Mário Figueiredo |
collection | DOAJ |
description | The presence of malicious software (malware), for example, in Android applications (apps), has harmful or irreparable consequences to the user and/or the device. Despite the protections app stores provide to avoid malware, it keeps growing in sophistication and diffusion. In this paper, we explore the use of machine learning (ML) techniques to detect malware in Android apps. The focus is on the study of different data pre-processing, dimensionality reduction, and classification techniques, assessing the generalization ability of the learned models using public domain datasets and specifically developed apps. We find that the classifiers that achieve better performance for this task are support vector machines (SVM) and random forests (RF). We emphasize the use of feature selection (FS) techniques to reduce the data dimensionality and to identify the most relevant features in Android malware classification, leading to explainability on this task. Our approach can identify the most relevant features to classify an app as malware. Namely, we conclude that permissions play a prominent role in Android malware detection. The proposed approach reduces the data dimensionality while achieving high accuracy in identifying malware in Android apps. |
first_indexed | 2024-03-08T10:46:55Z |
format | Article |
id | doaj.art-1b679555612a4051a7e9d095d9d4da3d |
institution | Directory of Open Access Journals |
issn | 2078-2489 |
language | English |
last_indexed | 2024-03-08T10:46:55Z |
publishDate | 2024-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Information |
doi | 10.3390/info15010025 |
volume | 15 |
issue | 1 |
article_number | 25 |
affiliation | Catarina Palma: ISEL, Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, 1959-007 Lisboa, Portugal; Artur Ferreira: ISEL, Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, 1959-007 Lisboa, Portugal; Mário Figueiredo: Instituto de Telecomunicações, 1049-001 Lisboa, Portugal |
title | Explainable Machine Learning for Malware Detection on Android Applications |
topic | android applications, datasets, explainability, feature selection, machine learning, malware detection |
url | https://www.mdpi.com/2078-2489/15/1/25 |
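
The description in this record credits feature selection (FS) with two things: reducing the data dimensionality and showing which features matter most when an app is classified as malware. The sketch below illustrates that idea only in outline. It is not the authors' method: the record does not say which FS techniques were used, so mutual information is assumed here as a generic filter-style relevance score. The six apps and their labels are invented toy data, the three features merely borrow the names of real Android permissions, and the function names `mutual_information` and `rank_features` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Only observed (x, y) pairs are visited, so p_xy is never zero.
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def rank_features(rows, labels, names):
    """Score every feature column against the labels, most relevant first."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores.append((name, mutual_information(column, labels)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data (assumption): 1 means the app requests the permission.
FEATURES = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
APPS = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
LABELS = [1, 1, 1, 0, 0, 0]  # 1 = malware, 0 = benign (invented labels)

ranking = rank_features(APPS, LABELS, FEATURES)

if __name__ == "__main__":
    for name, score in ranking:
        print(f"{name}: {score:.4f} bits")
```

In this toy set, `SEND_SMS` coincides exactly with the label and scores 1 bit, while `INTERNET` is requested by every app and scores 0, so it could be dropped without losing information. Keeping only the top-ranked features before training a classifier such as an SVM or a random forest is one way the dimensionality reduction and the explanation can come from the same step.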