Sumav: Fully automated malware labeling

Multiple AV engines are used to ensure more effective system protection against malicious files. These AV engines are capable of distinguishing between benign and malicious files, but even if a file of interest is proven to be malicious, it is still necessary to refer to a list of AV labels provided...


Bibliographic Details
Main Authors: Sangwon Kim, Wookhyun Jung, KyungMin Lee, HyungGeun Oh, Eui Tak Kim
Format: Article
Language: English
Published: Elsevier 2022-12-01
Series: ICT Express
Subjects: Malware, Labeling, AV labels, Clustering, Classification
Online Access: http://www.sciencedirect.com/science/article/pii/S2405959522000285
author Sangwon Kim
Wookhyun Jung
KyungMin Lee
HyungGeun Oh
Eui Tak Kim
collection DOAJ
description Multiple AV engines are often used together to protect systems more effectively against malicious files. These engines can distinguish benign files from malicious ones, but even when a file is known to be malicious, determining its family name still requires consulting the list of AV labels produced by each engine. These labels often lack a consistent naming scheme, and even family names differ from one AV engine to another. This study presents Sumav, a fully automated labeling tool that assigns each file a family name based on its AV labels. Previous approaches to this task required prior knowledge or datasets of malicious files that had already been labeled. In contrast, Sumav assigns family names using only the AV labels. The system also requires no maintenance and can maintain high-quality labeling even when the AV labeling scheme changes suddenly.
first_indexed 2024-04-13T08:19:47Z
format Article
id doaj.art-0f7ac43cf9244e40ae95482b4061260a
institution Directory Open Access Journal
issn 2405-9595
language English
last_indexed 2024-04-13T08:19:47Z
publishDate 2022-12-01
publisher Elsevier
record_format Article
series ICT Express
spelling doaj.art-0f7ac43cf9244e40ae95482b4061260a; 2022-12-22T02:54:41Z; eng; Elsevier; ICT Express; ISSN 2405-9595; 2022-12-01; vol. 8, no. 4, pp. 530-538
Sangwon Kim (Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea)
Wookhyun Jung (Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea)
KyungMin Lee (Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea)
HyungGeun Oh (National Security Research Institute, Daejeon, Republic of Korea)
Eui Tak Kim (Data Intelligence Lab, ESTsecurity, Seoul, Republic of Korea; corresponding author)
title Sumav: Fully automated malware labeling
topic Malware
Labeling
AV labels
Clustering
Classification
url http://www.sciencedirect.com/science/article/pii/S2405959522000285