Explainable Machine Learning for Malware Detection on Android Applications

Bibliographic Details
Main Authors: Catarina Palma, Artur Ferreira, Mário Figueiredo
Format: Article
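Language: English
Published: MDPI AG 2024-01-01
Series: Information
Subjects: android applications, datasets, explainability, feature selection, machine learning, malware detection
Online Access: https://www.mdpi.com/2078-2489/15/1/25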
collection DOAJ
description The presence of malicious software (malware), for example, in Android applications (apps), has harmful or irreparable consequences to the user and/or the device. Despite the protections app stores provide to avoid malware, it keeps growing in sophistication and diffusion. In this paper, we explore the use of machine learning (ML) techniques to detect malware in Android apps. The focus is on the study of different data pre-processing, dimensionality reduction, and classification techniques, assessing the generalization ability of the learned models using public domain datasets and specifically developed apps. We find that the classifiers that achieve better performance for this task are support vector machines (SVM) and random forests (RF). We emphasize the use of feature selection (FS) techniques to reduce the data dimensionality and to identify the most relevant features in Android malware classification, leading to explainability on this task. Our approach can identify the most relevant features to classify an app as malware. Namely, we conclude that permissions play a prominent role in Android malware detection. The proposed approach reduces the data dimensionality while achieving high accuracy in identifying malware in Android apps.
first_indexed 2024-03-08T10:46:55Z
format Article
id doaj.art-1b679555612a4051a7e9d095d9d4da3d
institution Directory Open Access Journal
issn 2078-2489
language English
last_indexed 2024-03-08T10:46:55Z
publishDate 2024-01-01
publisher MDPI AG
record_format Article
series Information
spelling doaj.art-1b679555612a4051a7e9d095d9d4da3d; 2024-01-26T17:03:42Z; eng; MDPI AG; Information; 2078-2489; 2024-01-01; 15; 1; 25; 10.3390/info15010025
Explainable Machine Learning for Malware Detection on Android Applications
Catarina Palma (ISEL, Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, 1959-007 Lisboa, Portugal)
Artur Ferreira (ISEL, Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, 1959-007 Lisboa, Portugal)
Mário Figueiredo (Instituto de Telecomunicações, 1049-001 Lisboa, Portugal)
https://www.mdpi.com/2078-2489/15/1/25
topic android applications
datasets
explainability
feature selection
machine learning
malware detection