Malware detection in memory images using machine learning

With the increasing prevalence and sophistication of malware, there is an urgent need for effective and efficient methods to detect them. Memory forensics has shown promising results in finding malware that can elude traditional security measures. At the same time, machine learning techniques have p...


Bibliographic details
Main author: Neo, Guat Kwan
Other authors: Luo Jun
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2023
Subjects: Engineering::Computer science and engineering
Online access: https://hdl.handle.net/10356/165974
_version_ 1826128287291670528
author Neo, Guat Kwan
author2 Luo Jun
author_facet Luo Jun
Neo, Guat Kwan
author_sort Neo, Guat Kwan
collection NTU
description With the increasing prevalence and sophistication of malware, there is an urgent need for effective and efficient methods to detect it. Memory forensics has shown promising results in finding malware that can elude traditional security measures. At the same time, machine learning techniques have proven effective in identifying unknown malware. By combining both approaches, a robust solution to malware detection can be developed. However, the effectiveness and practicality of the resulting models depend heavily on the quality of the datasets they are trained on. This study aims to assess the effectiveness of machine learning models trained on the CIC-MalMem-2022 dataset for detecting malware in memory images. The study also aims to evaluate the generalisation ability of these models when presented with unseen data and to investigate their potential for practical application. Six classification models were trained and evaluated, and the results showed high scores across multiple metrics in cross-validation. However, when tested with a new set of unseen data, the models produced poor results, and investigation revealed potential issues with the training dataset. The study concluded that dataset quality, along with factors such as operating system versions, system environment variations, and oversampling techniques, must be considered when developing memory dump datasets for practical use. The study also contributed MemDumpGen, a tool for automating the execution of samples and generation of memory dumps, and MalMemDetector, a proof-of-concept tool that showcases how trained models could be utilised in a practical setting.
first_indexed 2024-10-01T07:22:30Z
format Final Year Project (FYP)
id ntu-10356/165974
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:22:30Z
publishDate 2023
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/165974 2023-04-21T15:37:47Z Malware detection in memory images using machine learning Neo, Guat Kwan Luo Jun School of Computer Science and Engineering junluo@ntu.edu.sg Engineering::Computer science and engineering
Bachelor of Engineering (Computer Science) 2023-04-18T00:31:44Z 2023-04-18T00:31:44Z 2023 Final Year Project (FYP) Neo, G. K. (2023). Malware detection in memory images using machine learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165974 https://hdl.handle.net/10356/165974 en application/pdf Nanyang Technological University
spellingShingle Engineering::Computer science and engineering
Neo, Guat Kwan
Malware detection in memory images using machine learning
title Malware detection in memory images using machine learning
title_sort malware detection in memory images using machine learning
topic Engineering::Computer science and engineering
url https://hdl.handle.net/10356/165974
work_keys_str_mv AT neoguatkwan malwaredetectioninmemoryimagesusingmachinelearning