SK-Tree: a systematic malware detection algorithm on streaming trees via the signature kernel

The development of machine learning algorithms in the cyber security domain has been impeded by the complex, hierarchical, sequential and multimodal nature of the data involved. In this paper we introduce the notion of a streaming tree as a generic data structure encompassing a large portion of real...

Bibliographic Details
Main Authors: Cochrane, T; Foster, P; Chhabra, V; Lemercier, M; Lyons, T; Salvi, C
Format: Internet publication
Language: English
Published: 2021
Description: The development of machine learning algorithms in the cyber security domain has been impeded by the complex, hierarchical, sequential and multimodal nature of the data involved. In this paper we introduce the notion of a streaming tree as a generic data structure encompassing a large portion of real-world cyber security data. Starting from host-based event logs we represent computer processes as streaming trees that evolve in continuous time. Leveraging the properties of the signature kernel, a machine learning tool that recently emerged as a leading technology for learning with complex sequences of data, we develop the SK-Tree algorithm. SK-Tree is a supervised learning method for systematic malware detection on streaming trees that is robust to irregular sampling and high dimensionality of the underlying streams. We demonstrate the effectiveness of SK-Tree to detect malicious events on a portion of the publicly available DARPA OpTC dataset, achieving an AUROC score of 98%.
ID: oxford-uuid:db646fab-931c-493a-8b7b-657fd9752728
Institution: University of Oxford