The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models

Several supervised machine learning models have been proposed and used to detect Android ransomware. These models were trained using different datasets from different sources. However, the age of the ransomware datasets was not considered when training and testing these models. Therefore, the detect...

Bibliographic Details
Main Author: Qussai M. Yaseen
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Information
Subjects: Android malware; information security; supervised machine learning; ransomware
Online Access: https://www.mdpi.com/2078-2489/14/3/193
author Qussai M. Yaseen
collection DOAJ
description Several supervised machine learning models have been proposed and used to detect Android ransomware. These models were trained using different datasets from different sources. However, the age of the ransomware datasets was not considered when training and testing these models. Therefore, the reported detection accuracy of those models is unreliable, since they learned features from ransomware of a specific age, either old or new, rather than diverse ransomware features from different ages. This paper sheds light on the importance of considering the age of ransomware datasets and its effect on the detection accuracy of supervised machine learning models. It shows that supervised machine learning models trained using new ransomware datasets are inefficient in detecting old types of ransomware, and vice versa. Moreover, this paper collects a large and diverse dataset of ransomware applications that comprises new and old ransomware developed during the period 2008–2020. Furthermore, the paper proposes a supervised machine learning model that is trained and tested using the diverse dataset. The experiments show that the proposed model is efficient in detecting Android ransomware regardless of its age, achieving an accuracy of approximately 97.48%. Moreover, the results show that the proposed model outperforms the state-of-the-art approaches considered in this work.
first_indexed 2024-03-11T06:24:11Z
format Article
id doaj.art-8ce53b8497764982828ff767f69abd23
institution Directory Open Access Journal
issn 2078-2489
language English
last_indexed 2024-03-11T06:24:11Z
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Information
spelling doaj.art-8ce53b8497764982828ff767f69abd23 | 2023-11-17T11:44:23Z | eng | MDPI AG | Information | 2078-2489 | 2023-03-01 | 14 | 3 | 193 | 10.3390/info14030193 | The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models | Qussai M. Yaseen | Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman 20550, United Arab Emirates | https://www.mdpi.com/2078-2489/14/3/193 | Android malware | information security | supervised machine learning | ransomware
title The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models
topic Android malware
information security
supervised machine learning
ransomware
url https://www.mdpi.com/2078-2489/14/3/193