Android Malware Detection Using Machine Learning with Feature Selection Based on the Genetic Algorithm

Bibliographic Details
Main Authors: Jaehyeong Lee, Hyuk Jang, Sungmin Ha, Yourim Yoon
Format: Article
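Language: English
Published: MDPI AG 2021-11-01
Series: Mathematics
Subjects: android malware detection; machine learning; genetic algorithm; feature selection; static analysis
Online Access: https://www.mdpi.com/2227-7390/9/21/2813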
author Jaehyeong Lee
Hyuk Jang
Sungmin Ha
Yourim Yoon
collection DOAJ
description Since the discovery that machine learning can effectively detect Android malware, many studies on machine learning-based malware detection techniques have been conducted. Several methods based on feature selection, particularly with genetic algorithms, have been proposed to improve performance and reduce costs. However, these methods have limitations: they have not yet been compared with other methods, and their many features have not been sufficiently verified. This study investigates whether genetic algorithm-based feature selection helps Android malware detection. We applied nine machine learning algorithms with genetic algorithm-based feature selection to 1104 static features from 5000 benign applications and 2500 malware samples included in the Andro-AutoPsy dataset. Comparative experiments show that the genetic algorithm performed better than the information gain-based method, which is commonly used for feature selection. Moreover, machine learning with the proposed genetic algorithm-based feature selection has a clear advantage in terms of time over machine learning without feature selection. The results indicate that applying genetic algorithm-based feature selection to machine learning is a valuable way to improve Android malware detection performance.
first_indexed 2024-03-10T05:57:12Z
format Article
id doaj.art-cc88efdfb8064a069f3b63b14def0b80
institution Directory of Open Access Journals
issn 2227-7390
language English
last_indexed 2024-03-10T05:57:12Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-cc88efdfb8064a069f3b63b14def0b80
2023-11-22T21:19:17Z
eng
MDPI AG
Mathematics
2227-7390
2021-11-01
Volume 9, Issue 21, Article 2813
DOI 10.3390/math9212813
Android Malware Detection Using Machine Learning with Feature Selection Based on the Genetic Algorithm
Jaehyeong Lee (Department of Computer Engineering, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Korea)
Hyuk Jang (Department of Computer Engineering, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Korea)
Sungmin Ha (Department of Business Administration, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Korea)
Yourim Yoon (Department of Computer Engineering, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Korea)
https://www.mdpi.com/2227-7390/9/21/2813
android malware detection
machine learning
genetic algorithm
feature selection
static analysis
title Android Malware Detection Using Machine Learning with Feature Selection Based on the Genetic Algorithm
topic android malware detection
machine learning
genetic algorithm
feature selection
static analysis
url https://www.mdpi.com/2227-7390/9/21/2813