Summary: | Machine-learning-based malware detection has proved highly successful in the past few years. Most conventional methods rely on supervised learning over labeled static features. However, selecting static features requires both human expertise and labor. New selections, which fix features drawn from a wide range of candidates, are handcrafted through careful manual experimentation or adapted from existing methods. Despite their success, suitable static features remain hard to determine. In this paper, a Deep Q-learning based Feature Selection Architecture (DQFSA) is introduced to address the deficiencies of traditional methods. The proposed architecture automatically selects a small set of highly discriminative features for the malware detection task without human intervention. DQFSA trains an agent through Q-learning to maximize the expected accuracy of the classifiers on a validation dataset by sequentially interacting with the feature space. The agent, using an ε-greedy exploration strategy and experience replay, explores a large but finite space of possible actions and iteratively discovers selections with improved performance on the learning task. The actions form a set of admissible choices, each indicating whether a feature is selected or not. Extensive experimental results indicate that the proposed DQFSA outperforms existing baseline feature selection approaches on malware detection while using the fewest features, improves the generalization performance of the learning model, and reduces human intervention. Moreover, the underlying representation of the proposed architecture is robust enough to allow models to be re-calibrated to other domains of information security.
|
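The select-or-skip loop described in the summary can be illustrated with a minimal sketch. It is not the DQFSA implementation: it swaps the deep Q-network for a tabular Q-function, and it replaces classifier validation accuracy with a hypothetical scoring function, `toy_validation_score`, that rewards two "informative" feature indices and penalizes subset size. All names, hyperparameters, and the reward shape are assumptions made for illustration.

```python
import random
from collections import defaultdict, deque


def toy_validation_score(selected, informative=frozenset({1, 3}), penalty=0.05):
    """Hypothetical stand-in for validation accuracy (an assumption, not the
    paper's reward): fraction of informative features found, minus a size penalty."""
    hits = len(informative & set(selected))
    return hits / len(informative) - penalty * len(selected)


def select_features(n_features=5, episodes=3000, alpha=0.1, gamma=1.0,
                    eps_start=1.0, eps_end=0.05, batch_size=16, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and experience replay.

    State: (index of the feature under consideration, tuple of chosen indices).
    Action: 0 = skip the feature, 1 = select it.
    """
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])  # state -> [Q(skip), Q(select)]
    replay = deque(maxlen=2000)          # bounded experience replay buffer

    for ep in range(episodes):
        # Linearly decay epsilon over the first 70% of episodes.
        eps = max(eps_end, eps_start - (eps_start - eps_end) * ep / (0.7 * episodes))
        state = (0, ())
        while state[0] < n_features:
            idx, chosen = state
            if rng.random() < eps:
                action = rng.randrange(2)                         # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])   # exploit
            next_chosen = chosen + (idx,) if action == 1 else chosen
            next_state = (idx + 1, next_chosen)
            done = next_state[0] == n_features
            # The reward arrives only once the whole subset is fixed.
            reward = toy_validation_score(next_chosen) if done else 0.0
            replay.append((state, action, reward, next_state, done))
            state = next_state

        # Learn from a random minibatch of stored transitions.
        batch = rng.sample(list(replay), min(batch_size, len(replay)))
        for s, a, r, s2, terminal in batch:
            target = r if terminal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])

    # Greedy rollout with the learned Q-values yields the final subset.
    state = (0, ())
    while state[0] < n_features:
        idx, chosen = state
        action = max((0, 1), key=lambda a: q[state][a])
        state = (idx + 1, chosen + (idx,) if action == 1 else chosen)
    return list(state[1])


if __name__ == "__main__":
    subset = select_features()
    print(subset, toy_validation_score(subset))
```

In a realistic setting the toy score would be replaced by training a classifier on the chosen features and measuring accuracy on a validation set, and the Q-table would be replaced by a neural network, since the number of states grows exponentially with the number of features.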