Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (22): 278-287. DOI: 10.3778/j.issn.1002-8331.2407-0268

• Network, Communication and Security •

Readable DGA Malicious Domain Name Detection Model Based on CRNet

ZHAO Hong, DING Yanjiao, WANG Weijie   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Publication date: 2025-11-15 Release date: 2025-11-14

Abstract: Existing domain name detection models perform poorly on some readable DGA (domain generation algorithm) malicious domain names. To address this, a readable DGA malicious domain name detection model based on a convolutional retentive network (CRNet) is proposed. First, a lightweight retentive network (LRN) is proposed to capture the global semantic features of domain name strings and to fully exploit the contextual differences between readable DGA domain names and legitimate ones. Within the LRN, a multi-scale retention (MSR) mechanism captures shallow semantic information, and a lightweight convolutional feed forward network (LCFFN) is designed to mine deeper semantic information: a depthwise separable convolution (DSC) is inserted between the two linear layers of the feed forward network (FFN) to refine the features, and the Delight transformation module reduces the dimensionality of the domain name feature representation, which alleviates the high redundancy of semantic information between adjacent FFN layers. Second, a convolutional neural network (CNN) captures the combination features of different characters in the domain name string. Finally, the LRN and the CNN are combined so that both the global semantic features and the character combination features are used to improve readable DGA domain name detection. Experiments on the Majestic Million legitimate domain name dataset and the 360 DGA malicious domain name dataset show that, compared with current state-of-the-art DGA domain name detection models, CRNet improves detection efficiency while raising the F1 score by 0.59% to 3.48% on readable DGA domain names and by 0.32% to 1.42% on random DGA domain names.
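
The retention operation that a multi-scale retention mechanism builds on can be illustrated with a small sketch. The code below is a pure-Python illustration of generic retention with one exponential decay rate per head, not the LRN described in the abstract. It makes three assumptions: all function names are hypothetical; the same query, key and value vectors are shared by every head; and the position rotation, normalization and gating used in full retentive networks are omitted. It shows that the parallel (attention-like) form and the recurrent (state-update) form compute the same output.

```python
def retention_parallel(q, k, v, gamma):
    """Parallel form: out[n] = sum over m <= n of gamma**(n-m) * (q[n] . k[m]) * v[m]."""
    d_v = len(v[0])
    out = []
    for n in range(len(q)):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal: only positions m <= n contribute
            score = sum(a * b for a, b in zip(q[n], k[m])) * gamma ** (n - m)
            for j in range(d_v):
                row[j] += score * v[m][j]
        out.append(row)
    return out


def retention_recurrent(q, k, v, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, then out[n] = q_n S_n."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q_n, k_n, v_n in zip(q, k, v):
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] = gamma * state[i][j] + k_n[i] * v_n[j]
        out.append([sum(q_n[i] * state[i][j] for i in range(d_k))
                    for j in range(d_v)])
    return out


def multi_scale_retention(q, k, v, gammas):
    """Run one retention pass per decay rate and concatenate the heads per token.

    Simplification (assumption): every head reuses the same q, k, v.
    """
    heads = [retention_recurrent(q, k, v, g) for g in gammas]
    return [sum((head[n] for head in heads), []) for n in range(len(q))]


# Toy 2-dimensional vectors standing in for three embedded characters
# (the values are arbitrary and purely illustrative).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = multi_scale_retention(x, x, x, gammas=[0.5, 0.9])
```

A small decay rate makes a head attend mostly to nearby characters, while a rate close to 1 keeps long-range context, which is the sense in which the heads operate at multiple scales.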

Key words: readable DGA domain names, lightweight retentive network, lightweight convolutional feed forward network, multi-scale retention, global semantic features
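
For readers unfamiliar with the metric reported in the abstract, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from confusion counts. The function name and all counts are hypothetical and are not taken from the paper's experiments. The abstract does not state whether its improvements are absolute or relative differences, so printing the absolute difference here is an assumption.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2PR / (P + R), with P = tp / (tp + fp) and R = tp / (tp + fn)."""
    if tp == 0:
        return 0.0  # no true positives: F1 is taken to be zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts for two detectors on the same test set.
baseline = f1_score(tp=900, fp=60, fn=100)
candidate = f1_score(tp=930, fp=50, fn=70)
print(f"baseline F1 = {baseline:.4f}, candidate F1 = {candidate:.4f}")
print(f"absolute gain = {100 * (candidate - baseline):.2f} percentage points")
```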