Summary: | Due to the openness of an Android system, many Internet of Things (IoT) devices are running the Android system and Android devices have become a common control terminal for IoT devices because of various sensors on them. With the popularity of IoT devices, malware on Android-based IoT devices is also increasing. People’s lives and privacy security are threatened. To reduce such threat, many researchers have proposed new methods to detect Android malware. Currently, most malware detection products on the market are based on malware signatures, which have a fast detection speed and normally a low false alarm rate for known malware families. However, they cannot detect unknown malware and are easily evaded by malware that is confused or packaged. Many new solutions use syntactic features and machine learning techniques to classify Android malware. It has been known that analysis of the Function Call Graph (FCG) can capture behavioral features of malware well. This paper presents a new approach to classifying Android malware based on deep learning and OpCode-level FCG. The FCG is obtained through static analysis of Operation Code (OpCode), and the deep learning model we used is the Long Short-Term Memory (LSTM). We conducted experiments on a dataset with 1796 Android malware samples classified into two categories (obtained from Virusshare and AndroZoo) and 1000 benign Android apps. Our experimental results showed that our proposed approach with an accuracy of <inline-formula> <math display="inline"> <semantics> <mrow> <mn>97</mn> <mo>%</mo> </mrow> </semantics> </math> </inline-formula> outperforms the state-of-the-art methods such as those proposed by Nikola et al. and Hou et al. (IJCAI-18) with the accuracy of <inline-formula> <math display="inline"> <semantics> <mrow> <mn>97</mn> <mo>%</mo> </mrow> </semantics> </math> </inline-formula> and <inline-formula> <math display="inline"> <semantics> <mrow> <mn>91</mn> <mo>%</mo> </mrow> </semantics> </math> </inline-formula>, respectively. The time consumption of our proposed approach is less than the other two methods.
|