MalClassifier: Malware family classification using network flow sequence behaviour

Anti-malware vendors receive thousands of potentially malicious binaries daily to analyse and categorise before deploying the appropriate defence measure. Considering the limitations of existing malware analysis and classification methods, we present MalClassifier, a novel privacy-preserving system...

Bibliographic Details
Main Authors: AlAhmadi, B; Martinovic, I
Format: Conference item
Published: Institute of Electrical and Electronics Engineers 2018
author AlAhmadi, B
Martinovic, I
collection OXFORD
description Anti-malware vendors receive thousands of potentially malicious binaries daily to analyse and categorise before deploying the appropriate defence measure. Considering the limitations of existing malware analysis and classification methods, we present MalClassifier, a novel privacy-preserving system for the automatic analysis and classification of malware using network flow sequence mining. MalClassifier identifies the malware family behind detected malicious network activity without requiring access to the infected host or the malicious executable, reducing overall response time. MalClassifier abstracts the order and semantic behaviour of malware families’ network flow sequences as an n-flow. By mining and extracting the distinctive n-flows for each malware family, it automatically generates network flow sequence behaviour profiles. These profiles are used as features to build supervised machine learning classifiers (K-Nearest Neighbour and Random Forest) for malware family classification. We compute the degree of similarity between a flow sequence and the extracted profiles using a novel fuzzy similarity measure that accounts for both the similarity between flow attributes and the similarity between the order of the flow sequences. For classifier performance evaluation, we use network traffic datasets of ransomware and botnets, obtaining a 96% F-measure for family classification. MalClassifier is resilient to malware evasion through flow sequence manipulation, maintaining the classifier’s high accuracy. Our results demonstrate that this type of network flow-level sequence analysis is highly effective in malware family classification, providing insights into recurring malware network flow patterns.
first_indexed 2024-03-07T01:50:26Z
format Conference item
id oxford-uuid:99e6212a-7e15-4a3c-a547-a8ad3936907d
institution University of Oxford
last_indexed 2024-03-07T01:50:26Z
publishDate 2018
publisher Institute of Electrical and Electronics Engineers
record_format dspace
spelling oxford-uuid:99e6212a-7e15-4a3c-a547-a8ad3936907d; 2022-03-27T00:17:43Z; MalClassifier: Malware family classification using network flow sequence behaviour; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:99e6212a-7e15-4a3c-a547-a8ad3936907d; Symplectic Elements at Oxford; Institute of Electrical and Electronics Engineers; 2018; AlAhmadi, B; Martinovic, I
title MalClassifier: Malware family classification using network flow sequence behaviour