Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (11): 100-106. DOI: 10.3778/j.issn.1002-8331.2012-0475

• Network, Communication and Security •

Malicious Domain Names Detection by Improved Relief-C5.0

MA Donglin, ZHANG Shuhuan, ZHAO Hong   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Online: 2022-06-01 Published: 2022-06-01

改进Relief-C5.0的恶意域名检测算法

马栋林,张澍寰,赵宏   

  1. 兰州理工大学 计算机与通信学院,兰州 730050

Abstract: Classification models in current malicious domain name detection algorithms suffer from high computational complexity, poor real-time performance, and low accuracy. To address these problems, a malicious domain name detection algorithm based on Rf-C5 (Relief-C5.0) is proposed. First, the global URL features of the domain names under test are extracted. Then, an improved Relief algorithm computes a weight for each extracted feature, and the features are ranked by weight. Finally, the 20 highest-weighted features are selected as the input to a C5.0 classifier, which separates legitimate domain names from malicious ones. Experimental results show that, on a large-sample data set, the Rf-C5 model improves detection accuracy by 1.58 to 4.91 percentage points over current mainstream malicious domain name detection algorithms while also increasing the average detection speed.

Key words: malicious domain name, URL features, improved Relief algorithm, C5.0 classifier

摘要: 针对目前恶意域名检测算法中分类模型计算复杂度较大、实时性不强以及准确率不高等问题,提出了Rf-C5(Relief-C5.0)恶意域名检测算法模型。提取待测域名的全局URL特征,根据提取的特征按照改进的Relief算法进行权重计算,并依据权重值进行优先级排序;选取权重值排名前20的关键特征作为C5.0分类器的输入端,进行合法域名与恶意域名的分类。实验结果表明,在大样本数据集下,Rf-C5模型与当前主流恶意域名检测算法相比,在提高平均检测速率的基础上,检测准确率提高了1.58~4.91个百分点。

关键词: 恶意域名, URL特征, 改进的Relief算法, C5.0分类器
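
The feature-weighting and top-k selection steps described in the abstract can be illustrated with a minimal sketch. It uses the classic two-class Relief algorithm, not the improved variant proposed in the paper, whose details are not given on this page. The function names, the four lexical URL features, and the toy data are all hypothetical and chosen only for illustration; the paper's actual feature set may differ.

```python
import random


def url_features(url):
    """Hypothetical lexical features; the paper's real feature set is not listed here."""
    return [
        float(len(url)),                         # total length
        float(sum(ch.isdigit() for ch in url)),  # number of digits
        float(url.count(".")),                   # number of dots
        float(url.count("-")),                   # number of hyphens
    ]


def relief_weights(X, y, n_iter=None, seed=0):
    """Classic Relief for two classes and numeric features (not the paper's variant).

    With n_iter=None every instance is visited once, which makes the result
    deterministic; otherwise n_iter instances are sampled at random.
    """
    n, d = len(X), len(X[0])
    # Per-feature range, used to scale differences into [0, 1].
    span = []
    for j in range(d):
        col = [row[j] for row in X]
        span.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    if n_iter is None:
        picks = list(range(n))
    else:
        rng = random.Random(seed)
        picks = [rng.randrange(n) for _ in range(n_iter)]

    w = [0.0] * d
    for i in picks:
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that differ across classes, penalize those that
            # differ within a class.
            w[j] += (diff(X[i], X[near_miss], j) - diff(X[i], X[near_hit], j)) / len(picks)
    return w


def top_k_features(weights, k):
    """Indices of the k highest-weighted features, best first."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:k]


# Toy data: feature 0 separates the classes, feature 1 is noise.
X_toy = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y_toy = [0, 0, 0, 1, 1, 1]
w_toy = relief_weights(X_toy, y_toy)
selected = top_k_features(w_toy, 1)
```

In the pipeline the abstract describes, the selected columns (the top 20 there, rather than the top 1 used in this toy run) would then be passed to a C5.0 decision tree. That classifier is omitted here to keep the sketch dependency-free.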