AGCN-Domain: Detecting Malicious Domains with Graph Convolutional Network and Attention Mechanism

Bibliographic Details
Main Authors: Xi Luo, Yixin Li, Hongyuan Cheng, Lihua Yin
Format: Article
Language: English
Published: MDPI AG 2024-02-01
Series: Mathematics
Subjects:
Online Access: https://www.mdpi.com/2227-7390/12/5/640
Description
Summary: The Domain Name System (DNS) is core Internet infrastructure, providing the directory service that maps domains to IP addresses. Given the foundational role and openness of DNS, it is not surprising that adversaries register large numbers of domains to support malicious activities such as spam, command and control (C&C), malware distribution, and click fraud. Detecting malicious domains is therefore a significant topic in security research. Although a substantial body of research exists, previous work has failed to fuse multiple relationship features to uncover the deeper underlying relationships between domains, which largely limits its performance. In this paper, we propose AGCN-Domain, which detects malicious domains by combining several relations. The core idea is to analyze the relations between domains according to their behavior from multiple perspectives and to fuse them intelligently. The AGCN-Domain model uses three relationships (client relation, resolution relation, and cname relation) to construct three relationship feature graphs, extracts features from each graph, and fuses the extracted features through an attention mechanism. The relationship features extracted for the domain names are then passed to a trained classifier. Our experiments demonstrate the performance of the proposed AGCN-Domain model: with 10% initialized labels in the dataset, it achieved an accuracy of 94.27% and an F1 score of 87.93%, significantly outperforming the other methods in the comparative experiments.
ISSN: 2227-7390
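The summary describes building one graph per relation (client, resolution, cname), extracting features from each graph, and fusing them with an attention mechanism. The sketch below illustrates that general idea with a single generic graph convolution layer per relation and a softmax attention over the three resulting views. It is not the authors' architecture, which this record does not specify: the function names, layer sizes, random weights, and toy edge lists are all assumptions made for illustration, and no training step or classifier is included.

```python
import numpy as np


def adjacency_from_edges(num_nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    adj = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, features, weight):
    """One graph convolution followed by ReLU."""
    return np.maximum(normalize_adjacency(adj) @ features @ weight, 0.0)


def attention_fuse(views, att_weight, att_vector):
    """Fuse per-relation embeddings with a per-node softmax over the views."""
    stacked = np.stack(views, axis=1)                    # (nodes, views, dim)
    scores = np.tanh(stacked @ att_weight) @ att_vector  # (nodes, views)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    fused = (alpha[:, :, None] * stacked).sum(axis=1)    # (nodes, dim)
    return fused, alpha


# Toy setup (all values hypothetical): 4 domains, random input features.
rng = np.random.default_rng(0)
num_domains, in_dim, hid_dim, att_dim = 4, 5, 8, 6
features = rng.normal(size=(num_domains, in_dim))

# One hypothetical graph per relation named in the summary.
client_adj = adjacency_from_edges(num_domains, [(0, 1), (2, 3)])
resolution_adj = adjacency_from_edges(num_domains, [(0, 2)])
cname_adj = adjacency_from_edges(num_domains, [(1, 3)])

# Untrained random weights stand in for learned parameters.
views = [
    gcn_layer(adj, features, rng.normal(size=(in_dim, hid_dim)))
    for adj in (client_adj, resolution_adj, cname_adj)
]
fused, alpha = attention_fuse(
    views,
    rng.normal(size=(hid_dim, att_dim)),
    rng.normal(size=att_dim),
)
print(alpha.round(3))  # per-domain attention weights over the three relations
```

Each row of `alpha` sums to 1 and indicates how much each relation contributes to that domain's fused embedding. In a trained model, `fused` would be the input to the classifier mentioned in the summary.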