Feature Selection for Malware Detection Based on Reinforcement Learning

Machine learning based malware detection has proved highly successful in the past few years. Most conventional methods are based on supervised learning, which relies on labeled static features. However, selecting static features requires both human expertise and labor. New selections, which...

Bibliographic Details
Main Authors: Zhiyang Fang, Junfeng Wang, Jiaxuan Geng, Xuan Kan
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Feature selection; malware detection; deep reinforcement learning; Q-learning
Online Access: https://ieeexplore.ieee.org/document/8920059/
author Zhiyang Fang
Junfeng Wang
Jiaxuan Geng
Xuan Kan
collection DOAJ
description Machine learning based malware detection has proved highly successful in the past few years. Most conventional methods are based on supervised learning, which relies on labeled static features. However, selecting static features requires both human expertise and labor. New selections, which fix features from a wide range, are handcrafted through careful manual experimentation or modified from existing methods. Despite their success, static features remain hard to determine. In this paper, a Deep Q-learning based Feature Selection Architecture (DQFSA) is introduced to address the deficiencies of traditional methods. The proposed architecture automatically selects a small set of highly differentiated features for the malware detection task without human intervention. DQFSA trains an agent through Q-learning to maximize the expected accuracy of the classifiers on a validation dataset by sequentially interacting with the feature space. The agent, based on an ε-greedy exploration strategy and experience replay, explores a large but finite space of possible actions and iteratively discovers selections with improved performance on the learning task. The actions are a set of reasonable choices, each indicating whether a feature is chosen or not. Extensive experimental results indicate that the proposed DQFSA outperforms existing baseline approaches for feature selection on malware detection with a minimal number of features, improves the generalization performance of the learning model, and reduces human intervention. More specifically, the proposed architecture's underlying representation is robust enough for re-calibrating models to other domains of information security.
first_indexed 2024-12-22T09:49:08Z
format Article
id doaj.art-d59363ab1db5496bad8c232910310df4
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T09:49:08Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-d59363ab1db5496bad8c232910310df4 | 2022-12-21T18:30:27Z | eng | IEEE | IEEE Access | 2169-3536 | 2019-01-01 | 7 | 176177 | 176187 | 10.1109/ACCESS.2019.2957429 | 8920059 | Feature Selection for Malware Detection Based on Reinforcement Learning | Zhiyang Fang (0; https://orcid.org/0000-0001-7550-3970) | Junfeng Wang (1; https://orcid.org/0000-0003-4289-8106) | Jiaxuan Geng (2) | Xuan Kan (3) | College of Computer Science, Sichuan University, Chengdu, China (listed for each of the four authors) | https://ieeexplore.ieee.org/document/8920059/ | Feature selection | malware detection | deep reinforcement learning | Q-learning
title Feature Selection for Malware Detection Based on Reinforcement Learning
topic Feature selection
malware detection
deep reinforcement learning
Q-learning
url https://ieeexplore.ieee.org/document/8920059/
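
The abstract in this record describes an agent that decides, feature by feature, whether to keep each one, using Q-learning with ε-greedy exploration and experience replay. The following is only a minimal sketch of that general idea and not the authors' DQFSA implementation. It assumes a small tabular Q-table in place of a deep Q-network, and a synthetic reward in place of real classifier accuracy on a validation set. The names N_FEATURES, INFORMATIVE, toy_accuracy, step, train, and greedy_selection are all hypothetical.

```python
import random
from collections import defaultdict

N_FEATURES = 6
INFORMATIVE = {0, 3}  # hypothetical: the only features that help the toy "classifier"


def toy_accuracy(selected):
    """Synthetic stand-in for validation accuracy (an assumption, not real data):
    rewards informative features and penalizes the size of the selection."""
    hits = len(INFORMATIVE & set(selected))
    return 0.5 + 0.2 * hits - 0.05 * len(selected)


def step(state, action):
    """The state is the tuple of decisions made so far.
    Action 1 keeps the next feature, action 0 skips it."""
    next_state = state + (action,)
    done = len(next_state) == N_FEATURES
    reward = 0.0
    if done:
        selected = [i for i, a in enumerate(next_state) if a == 1]
        reward = toy_accuracy(selected)  # reward arrives only at the end of an episode
    return next_state, reward, done


def train(episodes=3000, alpha=0.1, gamma=1.0, batch_size=16, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # tabular Q-values keyed by (state, action)
    replay = []             # experience replay buffer of past transitions
    for ep in range(episodes):
        # Exploration rate decays linearly from 1.0 down to a floor of 0.05.
        epsilon = max(0.05, 1.0 - ep / (0.7 * episodes))
        state, done = (), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice((0, 1))                        # explore
            else:
                action = max((0, 1), key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = step(state, action)
            replay.append((state, action, reward, next_state, done))
            state = next_state
            # Q-learning updates on a random minibatch of stored transitions.
            batch = rng.sample(replay, min(batch_size, len(replay)))
            for s, a, r, s2, d in batch:
                target = r if d else r + gamma * max(q[(s2, 0)], q[(s2, 1)])
                q[(s, a)] += alpha * (target - q[(s, a)])
    return q


def greedy_selection(q):
    """Roll out the learned policy without exploration and list the kept features."""
    state = ()
    while len(state) < N_FEATURES:
        state += (max((0, 1), key=lambda a: q[(state, a)]),)
    return [i for i, a in enumerate(state) if a == 1]


if __name__ == "__main__":
    learned_q = train()
    print("selected features:", greedy_selection(learned_q))
```

In a realistic setting, toy_accuracy would be replaced by training and scoring a classifier on the chosen feature subset, and the Q-table by a function approximator, since the number of states grows exponentially with the number of features.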