Effective DGA-Domain Detection and Classification with TextCNN and Additional Features

Malicious code, such as that used in advanced persistent threat (APT) attacks, does not operate immediately after infecting a system, but only after receiving commands from the attacker’s command and control (C&C) server. The infected system tries to communicate with the C&C server thro...

Bibliographic Details
Main Authors:	Chanwoong Hwang, Hyosik Kim, Hooki Lee, Taejin Lee
Format:	Article
Language:	English
Published:	MDPI AG 2020-06-01
Series:	Electronics
Subjects:	security; domain generation algorithm; TextCNN; domain features; classification
Online Access:	https://www.mdpi.com/2079-9292/9/7/1070

_version_	1797563715827007488
author	Chanwoong Hwang, Hyosik Kim, Hooki Lee, Taejin Lee
author_facet	Chanwoong Hwang, Hyosik Kim, Hooki Lee, Taejin Lee
author_sort	Chanwoong Hwang
collection	DOAJ
description	Malicious code, such as that used in advanced persistent threat (APT) attacks, does not operate immediately after infecting a system, but only after receiving commands from the attacker’s command and control (C&C) server. The infected system tries to communicate with the C&C server through the server’s IP address or domain address. If that address is hard-coded inside the malicious code, analysts can extract it by analyzing the code and then block access to the C&C server through security policy. To circumvent this address blocking, malware includes domain generation algorithms that generate domain addresses dynamically. Because a domain generation algorithm (DGA) generates domains randomly, it is very difficult to identify and block the malicious domains. This paper therefore proposes a model that detects and classifies unknown DGA domains. We extract features that are effective for TextCNN-based label prediction and add domain knowledge-based features to improve the model’s detection and classification of DGA-generated malicious domains. The proposed model achieved 99.19% accuracy for DGA detection and 88.77% accuracy for classification by DGA class. We expect that the proposed model can be applied to detect and block DGA-generated domains effectively.
first_indexed	2024-03-10T18:47:23Z
format	Article
id	doaj.art-54e47ff6a9a44d8ea98875ef9556329e
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-10T18:47:23Z
publishDate	2020-06-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-54e47ff6a9a44d8ea98875ef9556329e; 2023-11-20T05:24:43Z; eng; MDPI AG; Electronics; 2079-9292; 2020-06-01; 9; 7; 1070; 10.3390/electronics9071070; Effective DGA-Domain Detection and Classification with TextCNN and Additional Features; Chanwoong Hwang (Department of Information Security, Hoseo University, Asan 31499, Korea); Hyosik Kim (Department of Information Security, Hoseo University, Asan 31499, Korea); Hooki Lee (Department of Cyber Security Engineering, Konyang University, Nonsan 32992, Korea); Taejin Lee (Department of Information Security, Hoseo University, Asan 31499, Korea); https://www.mdpi.com/2079-9292/9/7/1070; security; domain generation algorithm; TextCNN; domain features; classification
title	Effective DGA-Domain Detection and Classification with TextCNN and Additional Features
topic	security; domain generation algorithm; TextCNN; domain features; classification
url	https://www.mdpi.com/2079-9292/9/7/1070

Similar Items