Effective DGA-Domain Detection and Classification with TextCNN and Additional Features
Malicious codes, such as advanced persistent threat (APT) attacks, do not operate immediately after infecting the system, but after receiving commands from the attacker’s command and control (C&C) server. The system infected by the malicious code tries to communicate with the C&C server thro...
Main Authors: | Chanwoong Hwang, Hyosik Kim, Hooki Lee, Taejin Lee |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-06-01 |
Series: | Electronics |
Subjects: | security, domain generation algorithm, TextCNN, domain features, classification |
Online Access: | https://www.mdpi.com/2079-9292/9/7/1070 |
_version_ | 1797563715827007488 |
---|---|
author | Chanwoong Hwang; Hyosik Kim; Hooki Lee; Taejin Lee |
author_facet | Chanwoong Hwang; Hyosik Kim; Hooki Lee; Taejin Lee |
author_sort | Chanwoong Hwang |
collection | DOAJ |
description | Malicious codes, such as advanced persistent threat (APT) attacks, do not operate immediately after infecting the system, but after receiving commands from the attacker’s command and control (C&C) server. The system infected by the malicious code tries to communicate with the C&C server through the IP address or domain address of the C&C server. If the IP address or domain address is hard-coded inside the malicious code, analysts can examine the malicious code to obtain the address and block access to the C&C server through security policy. In order to circumvent this address blocking technique, domain generation algorithms are included in the malware to dynamically generate domain addresses. The domain generation algorithm (DGA) generates domains randomly, so it is very difficult to identify and block malicious domains. Therefore, this paper effectively detects and classifies unknown DGA domains. We extract features that are effective for TextCNN-based label prediction, and add additional domain knowledge-based features to improve our model for detecting and classifying DGA-generated malicious domains. The proposed model achieved 99.19% accuracy for DGA classification and 88.77% accuracy for DGA class classification. We expect that the proposed model can be applied to effectively detect and block DGA-generated domains. |
first_indexed | 2024-03-10T18:47:23Z |
format | Article |
id | doaj.art-54e47ff6a9a44d8ea98875ef9556329e |
institution | Directory Open Access Journal |
issn | 2079-9292 |
language | English |
last_indexed | 2024-03-10T18:47:23Z |
publishDate | 2020-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-54e47ff6a9a44d8ea98875ef9556329e; 2023-11-20T05:24:43Z; eng; MDPI AG; Electronics; 2079-9292; 2020-06-01; 9; 7; 1070; 10.3390/electronics9071070; Effective DGA-Domain Detection and Classification with TextCNN and Additional Features; Chanwoong Hwang (0); Hyosik Kim (1); Hooki Lee (2); Taejin Lee (3); Department of Information Security, Hoseo University, Asan 31499, Korea; Department of Information Security, Hoseo University, Asan 31499, Korea; Department of Cyber Security Engineering, Konyang University, Nonsan 32992, Korea; Department of Information Security, Hoseo University, Asan 31499, Korea; Malicious codes, such as advanced persistent threat (APT) attacks, do not operate immediately after infecting the system, but after receiving commands from the attacker’s command and control (C&C) server. The system infected by the malicious code tries to communicate with the C&C server through the IP address or domain address of the C&C server. If the IP address or domain address is hard-coded inside the malicious code, analysts can examine the malicious code to obtain the address and block access to the C&C server through security policy. In order to circumvent this address blocking technique, domain generation algorithms are included in the malware to dynamically generate domain addresses. The domain generation algorithm (DGA) generates domains randomly, so it is very difficult to identify and block malicious domains. Therefore, this paper effectively detects and classifies unknown DGA domains. We extract features that are effective for TextCNN-based label prediction, and add additional domain knowledge-based features to improve our model for detecting and classifying DGA-generated malicious domains. The proposed model achieved 99.19% accuracy for DGA classification and 88.77% accuracy for DGA class classification. We expect that the proposed model can be applied to effectively detect and block DGA-generated domains.; https://www.mdpi.com/2079-9292/9/7/1070; security; domain generation algorithm; TextCNN; domain features; classification |
spellingShingle | Chanwoong Hwang; Hyosik Kim; Hooki Lee; Taejin Lee; Effective DGA-Domain Detection and Classification with TextCNN and Additional Features; Electronics; security; domain generation algorithm; TextCNN; domain features; classification |
title | Effective DGA-Domain Detection and Classification with TextCNN and Additional Features |
title_full | Effective DGA-Domain Detection and Classification with TextCNN and Additional Features |
title_fullStr | Effective DGA-Domain Detection and Classification with TextCNN and Additional Features |
title_full_unstemmed | Effective DGA-Domain Detection and Classification with TextCNN and Additional Features |
title_short | Effective DGA-Domain Detection and Classification with TextCNN and Additional Features |
title_sort | effective dga domain detection and classification with textcnn and additional features |
topic | security; domain generation algorithm; TextCNN; domain features; classification |
url | https://www.mdpi.com/2079-9292/9/7/1070 |
work_keys_str_mv | AT chanwoonghwang effectivedgadomaindetectionandclassificationwithtextcnnandadditionalfeatures AT hyosikkim effectivedgadomaindetectionandclassificationwithtextcnnandadditionalfeatures AT hookilee effectivedgadomaindetectionandclassificationwithtextcnnandadditionalfeatures AT taejinlee effectivedgadomaindetectionandclassificationwithtextcnnandadditionalfeatures |
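
The description above mentions adding domain knowledge-based features alongside the TextCNN features, but this record does not list which features the authors used. As a rough illustration only, the sketch below computes a few lexical statistics that are commonly applied to domain names (length, Shannon entropy, digit ratio, vowel ratio). The function names and the choice of features are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> dict:
    """Hypothetical lexical features for one domain name.

    Simplification (an assumption of this sketch): only the leftmost
    label is inspected, so "www.example.com" would be scored on "www".
    """
    label = domain.lower().strip(".").split(".")[0]
    length = len(label)
    letters = sum(ch.isalpha() for ch in label)
    digits = sum(ch.isdigit() for ch in label)
    vowels = sum(ch in VOWELS for ch in label)
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / length if length else 0.0,
        "vowel_ratio": vowels / letters if letters else 0.0,
    }


if __name__ == "__main__":
    for name in ("example.com", "a1b2.net"):
        print(name, domain_features(name))
```

Features of this kind would typically be concatenated with the learned text representation before the final classifier; how the proposed model combines them is not stated in this record.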