A score based malware classification approach for mobile forensic analysis

Bibliographic Details
Main Author: Gobi, Ramyaa
Format: Thesis
Language: English
Published: 2019
Subjects:
Online Access: http://psasir.upm.edu.my/id/eprint/83857/1/FSKTM%202019%2040%20-IR.pdf
Description
Summary: The rapid growth of Android as one of the leading operating systems (OS) for mobile devices drives the need for effective security measures that give users a safer platform. Boolean-based features used for application permissions degrade the precision, recall, F1-score and accuracy of malware detection, because they classify benign and malware applications on a true-or-false rule, using binary 0 for benign and 1 for malware. FAMOUS (Forensic Analysis of MObile devices Using Scoring of application permissions) incorporates the Effective Maliciousness Score of Permission (EMSP), a score-based representation that replaces the Boolean representation of permissions, and has produced better accuracy, precision, recall and F1-score than the Boolean-based features of existing works. FAMOUS was tested on crawled datasets collected from multiple public archives such as Contagio dump, AndroMalShare, the Drebin project and AndroTotal. These crawled datasets were then labelled using the results from VirusTotal engines. Thus, FAMOUS did not use any standard dataset for its analysis. In this research, we implement EMSP, a score-based triage, and test it on the Android Malware Dataset (AMD) and the Android PRAGuard dataset to ensure reliable results for accuracy, precision, recall and F1-score through machine learning classifiers. A total of five classifiers are used to train and test the datasets: Random Forest, Decision Tree, Naive Bayes, K-nearest neighbours, and Support Vector Machine. EMSP is implemented in Python on a Windows system. The performance metrics evaluated are precision, recall, F1-score and accuracy. The accuracy obtained varies across classifiers for the AMD and Android PRAGuard datasets. The best result was obtained with the Random Forest classifier on AMD.
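
The contrast between Boolean and score-based permission features, and the four metrics named in the summary, can be illustrated with a minimal Python sketch. The permission scores, the function names and the sample labels below are hypothetical and invented for illustration only; the record does not give the EMSP formula, so this is not the thesis's scoring method, only a picture of what replacing a 0/1 feature with a numeric score looks like.

```python
from typing import Dict, List, Sequence

# Hypothetical per-permission scores, made up for this example.
# They are NOT EMSP values from the thesis.
HYPOTHETICAL_SCORES: Dict[str, float] = {
    "SEND_SMS": 0.9,
    "READ_CONTACTS": 0.6,
    "INTERNET": 0.2,
}


def encode_boolean(requested: Sequence[str], vocabulary: Sequence[str]) -> List[int]:
    """Boolean representation: 1 if the app requests the permission, else 0."""
    return [1 if perm in requested else 0 for perm in vocabulary]


def encode_scored(
    requested: Sequence[str],
    vocabulary: Sequence[str],
    scores: Dict[str, float],
) -> List[float]:
    """Score-based representation: the permission's score if requested, else 0.0."""
    return [scores.get(perm, 0.0) if perm in requested else 0.0 for perm in vocabulary]


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1-score, with 1 = malware and 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    vocabulary = ["INTERNET", "READ_CONTACTS", "SEND_SMS"]
    app_permissions = ["INTERNET", "SEND_SMS"]
    print(encode_boolean(app_permissions, vocabulary))
    print(encode_scored(app_permissions, vocabulary, HYPOTHETICAL_SCORES))
    print(classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Either feature vector could then be passed to any of the five classifiers listed in the summary; the metrics function applies to the predictions regardless of which representation produced them.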