Study on Malware Classification Based on N-Gram Static Analysis Technology

Bibliographic Details
Main Authors:	ZHANG Guang-hua, GAO Tian-jiao, CHEN Zhen-guo, YU Nai-wen
Format:	Article
Language:	zho
Published:	Editorial office of Computer Science 2022-08-01
Series:	Jisuanji kexue
Subjects:	n-gram | static analysis | machine learning | malware
Online Access:	https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-8-336.pdf

Description
Summary:	To address the low accuracy of malware classification, this paper proposes a malware classification method based on N-Gram static analysis. First, the N-Gram method is used to extract byte sequences of length 2 from the malware samples. Second, based on the extracted features, KNN, logistic regression, random forest, and XGBoost are used to train machine-learning malware classification models. Third, the confusion matrix and the logarithmic loss function are used to evaluate the models. Finally, the models are trained and tested on the Kaggle malware data set. Experimental results show that the XGBoost and random forest models reach accuracy rates of 98.43% and 97.93%, with Log Loss values of 0.022240 and 0.026946, respectively. Compared with existing methods, the proposed method classifies malware more accurately and helps protect computer systems from malware attacks.
ISSN:	1002-137X
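
The summary names two steps that are easy to illustrate: extracting byte sequences of length 2 and scoring predictions with logarithmic loss. The following is a minimal sketch of both, not the authors' implementation. The function names, the relative-frequency normalization, and the 65,536-slot vector layout are assumptions made for illustration; the record does not say how the paper encodes or normalizes its features.

```python
import math
from collections import Counter


def byte_bigram_counts(data: bytes) -> Counter:
    """Count every overlapping 2-byte sequence (byte 2-gram) in data."""
    return Counter(data[i:i + 2] for i in range(len(data) - 1))


def bigram_feature_vector(data: bytes) -> list[float]:
    """Turn 2-gram counts into a fixed-length vector of relative frequencies.

    Assumption: one slot per possible byte pair (256 * 256 = 65,536),
    indexed as first_byte * 256 + second_byte.
    """
    counts = byte_bigram_counts(data)
    total = max(len(data) - 1, 1)  # number of 2-grams; avoid dividing by zero
    vector = [0.0] * 65536
    for gram, n in counts.items():
        vector[gram[0] * 256 + gram[1]] = n / total
    return vector


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Multiclass logarithmic loss.

    Averages -log(p) over samples, where p is the probability the model
    assigned to the true class. Probabilities are clipped to [eps, 1 - eps]
    so that log(0) never occurs.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1 - eps)
        total -= math.log(p)
    return total / len(true_labels)
```

A feature vector of this kind could be passed to any of the classifiers listed in the summary. A lower log loss means the predicted class probabilities are closer to the true labels, which is why it is reported alongside accuracy.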

Similar Items