The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models
Several supervised machine learning models have been proposed and used to detect Android ransomware. These models were trained using different datasets from different sources. However, the age of the ransomware datasets was not considered when training and testing these models. Therefore, the detect...
Main Author: | Qussai M. Yaseen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-03-01 |
Series: | Information |
Subjects: | Android malware; information security; supervised machine learning; ransomware |
Online Access: | https://www.mdpi.com/2078-2489/14/3/193 |
_version_ | 1797611109219303424 |
---|---|
author | Qussai M. Yaseen |
collection | DOAJ |
description | Several supervised machine learning models have been proposed and used to detect Android ransomware. These models were trained using different datasets from different sources. However, the age of the ransomware datasets was not considered when training and testing these models. Therefore, the reported detection accuracy of those models is unreliable, since they learned features from ransomware of a specific age, either old or new, rather than diverse ransomware features from different ages. This paper sheds light on the importance of considering the age of ransomware datasets and its effect on the detection accuracy of supervised machine learning models. It demonstrates that supervised machine learning models trained using a new ransomware dataset are ineffective at detecting old types of ransomware, and vice versa. Moreover, this paper collects a large and diverse dataset of ransomware applications that comprises new and old ransomware developed during the period 2008–2020. Furthermore, the paper proposes a supervised machine learning model that is trained and tested using the diverse dataset. The experiments show that the proposed model is effective in detecting Android ransomware regardless of its age, achieving an accuracy of approximately 97.48%. Moreover, the results show that the proposed model outperforms the state-of-the-art approaches considered in this work. |
first_indexed | 2024-03-11T06:24:11Z |
format | Article |
id | doaj.art-8ce53b8497764982828ff767f69abd23 |
institution | Directory Open Access Journal |
issn | 2078-2489 |
language | English |
last_indexed | 2024-03-11T06:24:11Z |
publishDate | 2023-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Information |
spelling | doaj.art-8ce53b8497764982828ff767f69abd23; 2023-11-17T11:44:23Z; eng; MDPI AG; Information; 2078-2489; 2023-03-01; vol. 14, no. 3, art. 193; doi:10.3390/info14030193; The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models; Qussai M. Yaseen, Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman 20550, United Arab Emirates; https://www.mdpi.com/2078-2489/14/3/193; Android malware; information security; supervised machine learning; ransomware |
title | The Effect of the Ransomware Dataset Age on the Detection Accuracy of Machine Learning Models |
topic | Android malware information security supervised machine learning ransomware |
url | https://www.mdpi.com/2078-2489/14/3/193 |