MalClassifier: Malware family classification using network flow sequence behaviour
Anti-malware vendors receive thousands of potentially malicious binaries daily to analyse and categorise before deploying the appropriate defence measure. Considering the limitations of existing malware analysis and classification methods, we present MalClassifier, a novel privacy-preserving system...
Main Authors: | AlAhmadi, B; Martinovic, I |
---|---|
Format: | Conference item |
Published: | Institute of Electrical and Electronics Engineers, 2018 |
_version_ | 1797084083512147968 |
---|---|
author | AlAhmadi, B; Martinovic, I |
collection | OXFORD |
description | Anti-malware vendors receive thousands of potentially malicious binaries daily to analyse and categorise before deploying the appropriate defence measure. Considering the limitations of existing malware analysis and classification methods, we present MalClassifier, a novel privacy-preserving system for the automatic analysis and classification of malware using network flow sequence mining. MalClassifier identifies the malware family behind detected malicious network activity without requiring access to the infected host or the malicious executable, reducing overall response time. MalClassifier abstracts the order and semantic behaviour of a malware family’s network flow sequence as an n-flow. By mining and extracting the distinctive n-flows for each malware family, it automatically generates network flow sequence behaviour profiles. These profiles are used as features to build supervised machine learning classifiers (K-Nearest Neighbour and Random Forest) for malware family classification. We compute the degree of similarity between a flow sequence and the extracted profiles using a novel fuzzy similarity measure that computes both the similarity between flow attributes and the similarity between the order of the flow sequences. For classifier performance evaluation, we use network traffic datasets of ransomware and botnets, obtaining a 96% F-measure for family classification. MalClassifier is resilient to malware evasion through flow sequence manipulation, maintaining the classifier’s high accuracy. Our results demonstrate that this type of network flow-level sequence analysis is highly effective in malware family classification, providing insights into recurring malware network flow patterns. |
first_indexed | 2024-03-07T01:50:26Z |
format | Conference item |
id | oxford-uuid:99e6212a-7e15-4a3c-a547-a8ad3936907d |
institution | University of Oxford |
last_indexed | 2024-03-07T01:50:26Z |
publishDate | 2018 |
publisher | Institute of Electrical and Electronics Engineers |
record_format | dspace |
title | MalClassifier: Malware family classification using network flow sequence behaviour |
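
The description above mentions abstracting flow sequences as n-flows and mining the distinctive n-flows of each malware family. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes an n-flow is a contiguous run of n flows, that each flow is reduced to a hypothetical `(protocol, destination port)` tuple, and that "distinctive" means an n-flow seen in one family and in no other. The record describes a fuzzy similarity measure rather than the exact matching used here, so the set difference is a deliberate simplification. The function names and the toy families are illustrative assumptions.

```python
from collections import Counter


def n_flows(flows, n):
    """Return all contiguous length-n subsequences (n-flows) of a flow sequence."""
    return [tuple(flows[i:i + n]) for i in range(len(flows) - n + 1)]


def distinctive_n_flows(family_sequences, n):
    """Map each family to the n-flows seen in that family and in no other.

    Assumption: exact matching on flow tuples, which is simpler than the
    fuzzy similarity measure mentioned in the description.
    """
    # Count every n-flow per family across all of that family's sequences.
    counts = {
        family: Counter(g for seq in seqs for g in n_flows(seq, n))
        for family, seqs in family_sequences.items()
    }
    profiles = {}
    for family, counter in counts.items():
        # Collect the n-flows observed in any other family.
        seen_elsewhere = set()
        for other, other_counter in counts.items():
            if other != family:
                seen_elsewhere.update(other_counter)
        profiles[family] = set(counter) - seen_elsewhere
    return profiles


# Hypothetical toy data: each flow is (protocol, destination port).
toy = {
    "family_a": [[("udp", 53), ("tcp", 80), ("tcp", 443)]],
    "family_b": [[("udp", 53), ("tcp", 80), ("tcp", 25)]],
}
profiles = distinctive_n_flows(toy, n=2)
```

In this toy data both families share the 2-flow that starts with the DNS lookup, so it is dropped from both profiles, and each profile keeps only the 2-flow that ends in its own final flow. In a real pipeline, per-profile similarity scores would then serve as feature vectors for a classifier such as K-Nearest Neighbour or Random Forest.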