Sumav: Fully automated malware labeling
Multiple AV engines are used to ensure more effective system protection against malicious files. These AV engines are capable of distinguishing between benign and malicious files, but even if a file of interest is proven to be malicious, it is still necessary to refer to a list of AV labels provided...
Main Authors: | Sangwon Kim, Wookhyun Jung, KyungMin Lee, HyungGeun Oh, Eui Tak Kim |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2022-12-01 |
Series: | ICT Express |
Subjects: | Malware, Labeling, AV labels, Clustering, Classification |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2405959522000285 |
_version_ | 1811305065295642624 |
---|---|
author | Sangwon Kim; Wookhyun Jung; KyungMin Lee; HyungGeun Oh; Eui Tak Kim |
author_facet | Sangwon Kim; Wookhyun Jung; KyungMin Lee; HyungGeun Oh; Eui Tak Kim |
author_sort | Sangwon Kim |
collection | DOAJ |
description | Multiple AV engines are used to ensure more effective system protection against malicious files. These AV engines are capable of distinguishing between benign and malicious files, but even if a file of interest is proven to be malicious, it is still necessary to refer to a list of AV labels provided by each AV engine to determine what family name the malicious file belongs to. However, oftentimes, such AV labels lack a consistent naming scheme, and even family names differ from one AV engine to another. The present study presents Sumav, a fully automated labeling tool that assigns each file a family name based on AV labels. According to previous studies, such a task required prior knowledge or malicious file datasets that had already been labeled. In contrast, Sumav can assign family names with only the AV labels. This system also requires no maintenance and can provide high-quality labeling performance even if sudden changes have been made to the AV label system. |
first_indexed | 2024-04-13T08:19:47Z |
format | Article |
id | doaj.art-0f7ac43cf9244e40ae95482b4061260a |
institution | Directory Open Access Journal |
issn | 2405-9595 |
language | English |
last_indexed | 2024-04-13T08:19:47Z |
publishDate | 2022-12-01 |
publisher | Elsevier |
record_format | Article |
series | ICT Express |
spelling | doaj.art-0f7ac43cf9244e40ae95482b4061260a; 2022-12-22T02:54:41Z; eng; Elsevier; ICT Express; 2405-9595; 2022-12-01; 8(4):530-538; Sumav: Fully automated malware labeling; Sangwon Kim (0), Wookhyun Jung (1), KyungMin Lee (2), HyungGeun Oh (3), Eui Tak Kim (4); (0) Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea; (1) Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea; (2) Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea; (3) National Security Research Institute, Daejeon, Republic of Korea; (4) Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea (corresponding author); Multiple AV engines are used to ensure more effective system protection against malicious files. These AV engines are capable of distinguishing between benign and malicious files, but even if a file of interest is proven to be malicious, it is still necessary to refer to a list of AV labels provided by each AV engine to determine what family name the malicious file belongs to. However, oftentimes, such AV labels lack a consistent naming scheme, and even family names differ from one AV engine to another. The present study presents Sumav, a fully automated labeling tool that assigns each file a family name based on AV labels. According to previous studies, such a task required prior knowledge or malicious file datasets that had already been labeled. In contrast, Sumav can assign family names with only the AV labels. This system also requires no maintenance and can provide high-quality labeling performance even if sudden changes have been made to the AV label system.; http://www.sciencedirect.com/science/article/pii/S2405959522000285; Malware; Labeling; AV labels; Clustering; Classification |
spellingShingle | Sangwon Kim; Wookhyun Jung; KyungMin Lee; HyungGeun Oh; Eui Tak Kim; Sumav: Fully automated malware labeling; ICT Express; Malware; Labeling; AV labels; Clustering; Classification |
title | Sumav: Fully automated malware labeling |
title_full | Sumav: Fully automated malware labeling |
title_fullStr | Sumav: Fully automated malware labeling |
title_full_unstemmed | Sumav: Fully automated malware labeling |
title_short | Sumav: Fully automated malware labeling |
title_sort | sumav fully automated malware labeling |
topic | Malware; Labeling; AV labels; Clustering; Classification |
url | http://www.sciencedirect.com/science/article/pii/S2405959522000285 |
work_keys_str_mv | AT sangwonkim sumavfullyautomatedmalwarelabeling AT wookhyunjung sumavfullyautomatedmalwarelabeling AT kyungminlee sumavfullyautomatedmalwarelabeling AT hyunggeunoh sumavfullyautomatedmalwarelabeling AT euitakkim sumavfullyautomatedmalwarelabeling |