Intra- and inter-sector contextual information fusion with joint self-attention for file fragment classification

File fragment classification (FFC) aims to identify the file type of file fragments in memory sectors, which is of great importance in memory forensics and information security. Existing works focus on processing the bytes within sectors separately while ignoring contextual information between adjacent sectors. In this paper, we introduce a joint self-attention network (JSANet) for FFC to learn intra-sector local features and inter-sector contextual features. Specifically, we propose an end-to-end network with byte, channel, and sector self-attention modules. Byte self-attention adaptively recognizes the significant bytes within a sector, and channel self-attention re-calibrates the features between channels. Based on the insight that adjacent memory sectors are most likely to store a file fragment, sector self-attention leverages contextual information in neighboring sectors to enhance inter-sector feature representation. Extensive experiments on seven FFC benchmarks show the superiority of our method compared with state-of-the-art methods. Moreover, we construct VFF-16, a variable-length file fragment dataset that reflects file fragmentation. Integrated with sector self-attention, our method improves accuracy by more than 16.3% over the baseline on VFF-16, and the runtime reaches 5.1 s/GB with GPU acceleration. In addition, we extend our model to malware detection and show its applicability.
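To make the described architecture concrete, the sketch below illustrates the general idea of joint byte, channel, and sector self-attention in PyTorch. It is a minimal sketch, not the authors' JSANet implementation: the embedding size, attention head counts, the squeeze-and-excitation-style channel attention, the mean pooling across bytes, the 512-byte sector length, and the 16-class output (echoing the VFF-16 name) are all assumptions made for illustration only.

```python
# Hedged sketch of joint intra-/inter-sector self-attention for FFC.
# All module sizes and shapes are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn


class ByteSelfAttention(nn.Module):
    """Self-attention over byte positions within a single sector."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):  # x: (batch * sectors, bytes, dim)
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)  # residual connection


class ChannelSelfAttention(nn.Module):
    """Squeeze-and-excitation-style re-calibration across feature channels."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, x):  # x: (batch * sectors, bytes, dim)
        weights = self.fc(x.mean(dim=1))  # per-channel gates, (batch * sectors, dim)
        return x * weights.unsqueeze(1)   # rescale each feature channel


class SectorSelfAttention(nn.Module):
    """Self-attention across neighboring sectors to fuse contextual features."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):  # x: (batch, sectors, dim)
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)


class JSANetSketch(nn.Module):
    """Toy end-to-end pipeline: byte embedding -> intra-sector attention
    -> per-sector pooling -> inter-sector attention -> per-sector classifier."""

    def __init__(self, dim: int = 64, num_classes: int = 16):
        super().__init__()
        self.embed = nn.Embedding(256, dim)  # one embedding per byte value 0..255
        self.byte_attn = ByteSelfAttention(dim)
        self.channel_attn = ChannelSelfAttention(dim)
        self.sector_attn = SectorSelfAttention(dim)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, sectors):  # sectors: (batch, num_sectors, sector_bytes), integer byte values
        b, s, n = sectors.shape
        x = self.embed(sectors.view(b * s, n))    # (b*s, n, dim)
        x = self.channel_attn(self.byte_attn(x))  # intra-sector local features
        x = x.mean(dim=1).view(b, s, -1)          # pool bytes -> one vector per sector
        x = self.sector_attn(x)                   # inter-sector context fusion
        return self.classifier(x)                 # (b, s, num_classes) logits


if __name__ == "__main__":
    model = JSANetSketch()
    fragments = torch.randint(0, 256, (2, 8, 512))  # 2 samples, 8 sectors of 512 bytes each
    print(model(fragments).shape)                   # torch.Size([2, 8, 16])
```

The sketch only reflects the high-level decomposition stated in the abstract: byte and channel attention operate within each sector, sector attention fuses features across adjacent sectors, and each sector receives its own file-type prediction.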

Bibliographic Details
Main Authors: Wang, Yi, Liu, Wenyang, Wu, Kejun, Yap, Kim-Hui, Chau, Lap-Pui
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2024
Subjects: Computer and Information Science; File fragment classification; File carving; Memory forensics; Contextual information; Self-attention
Online Access:https://hdl.handle.net/10356/174537
Citation: Wang, Y., Liu, W., Wu, K., Yap, K. & Chau, L. (2024). Intra- and inter-sector contextual information fusion with joint self-attention for file fragment classification. Knowledge-Based Systems, 291, 111565. https://dx.doi.org/10.1016/j.knosys.2024.111565
ISSN: 0950-7051
Funding: This research is supported by the National Research Foundation (NRF), Singapore, and the Cyber Security Agency of Singapore under its National Cybersecurity Research & Development Programme (Cyber-Hardware Forensic & Assurance Evaluation R&D Programme, NRF2018NCRNCR009-0001).
Deposited version: Submitted/Accepted version.
Rights: © 2024 Elsevier B.V. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1016/j.knosys.2024.111565.