Research on Malicious TLS Traffic Identification Based on Hybrid Neural Network

doi:10.3778/j.issn.1002-8331.2003-0430

Computer Engineering and Applications, 2021, Vol. 57, Issue 7: 107-114. DOI: 10.3778/j.issn.1002-8331.2003-0430

WEI Jihong, ZHENG Rongfeng, LIU Jiayong

1. College of Cybersecurity, Sichuan University, Chengdu 610065, China
2. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China

Online: 2021-04-01 Published: 2021-04-02

Abstract:

Traditional machine learning methods for identifying malicious TLS traffic depend heavily on expert experience and give unsatisfactory identification and classification results. To address this, a Hybrid Neural Network Identification Model (HNNIM) is proposed. The model consists of two layers: the first extracts features, and the second performs identification and classification. In the first layer, the extracted features come from two parts: one part is mined automatically by a deep neural network; the other is selected according to expert experience and then further screened by a deep neural network. The second layer aggregates the features screened by the first layer and uses a fully connected deep neural network for further learning and fitting. Based on an analysis of a large number of TLS traffic samples, the ClientHello and ServerHello messages and the TCP protocol interaction information are selected as the feature space. Experimental results show that, on the malicious TLS traffic identification task, HNNIM achieves an F1 score of 0.989 on malicious samples, which is 0.016, 0.016, 0.019, and 0.043 higher than the random forest, SVM, XGBoost, and convolutional neural network models, respectively. On the multi-class classification task, its average accuracy is 89.28%, which is 9.92%, 9.09%, 11.31%, and 7.03% higher than the random forest, SVM, XGBoost, and convolutional neural network models, respectively.
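The two-layer structure described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the abstract does not give layer types, sizes, or training details, so the plain dense layers, the hidden width of 8, the random untrained weights, the example bytes, and every name (`HybridSketch`, `dense`, `init_layer`) are assumptions chosen only to show how automatically mined features and screened expert features could be merged before a fully connected classifier.

```python
import math
import random


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one output value per weight row."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, z) if activation == "relu" else z)
    return outputs


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases; a real model would learn these."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class HybridSketch:
    """Hypothetical stand-in for the two-layer model in the abstract."""

    def __init__(self, n_raw, n_expert, n_classes, hidden=8, seed=0):
        rng = random.Random(seed)
        # Layer 1, branch A: mines features from raw handshake bytes.
        self.raw_branch = init_layer(rng, n_raw, hidden)
        # Layer 1, branch B: screens expert-selected features.
        self.expert_branch = init_layer(rng, n_expert, hidden)
        # Layer 2: fully connected network over the merged features.
        self.fc_hidden = init_layer(rng, 2 * hidden, hidden)
        self.fc_out = init_layer(rng, hidden, n_classes)

    def forward(self, raw_bytes, expert_features):
        raw = [b / 255.0 for b in raw_bytes]  # scale bytes to [0, 1]
        mined = dense(raw, *self.raw_branch, activation="relu")
        screened = dense(expert_features, *self.expert_branch,
                         activation="relu")
        merged = mined + screened  # list concatenation = feature aggregation
        hidden = dense(merged, *self.fc_hidden, activation="relu")
        return softmax(dense(hidden, *self.fc_out))


# Illustrative inputs only: a made-up ClientHello prefix padded with zeros
# and four made-up, pre-normalized TCP interaction features.
model = HybridSketch(n_raw=16, n_expert=4, n_classes=2)
hello_bytes = [0x16, 0x03, 0x01, 0x02, 0x00, 0x01, 0x00, 0x01,
               0xFC, 0x03, 0x03, 0, 0, 0, 0, 0]
tcp_features = [0.2, 0.5, 0.1, 0.9]
probs = model.forward(hello_bytes, tcp_features)
print(probs)  # two class probabilities (benign vs. malicious), untrained
```

Because the weights are random, the printed probabilities carry no meaning; the point is only the data flow, in which two feature branches are concatenated and passed to a fully connected classifier.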

Keywords: TLS traffic identification, malicious encrypted traffic, traditional machine learning, deep neural network, automatic feature mining
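The abstract reports HNNIM's scores together with its margins over each baseline, so the baseline scores themselves can be recovered by subtraction. The short calculation below does this under one assumption that the abstract does not state explicitly: that every reported improvement is an absolute difference (F1 points, and percentage points for accuracy). The resulting values are implied by the abstract's figures, not quoted from the paper.

```python
# Figures taken from the abstract.
hnnim_f1 = 0.989
f1_gain = {"random forest": 0.016, "SVM": 0.016,
           "XGBoost": 0.019, "CNN": 0.043}
hnnim_acc = 89.28  # average multi-class accuracy, in percent
acc_gain = {"random forest": 9.92, "SVM": 9.09,
            "XGBoost": 11.31, "CNN": 7.03}

# Assumption: gains are absolute differences, so baseline = HNNIM - gain.
implied_f1 = {name: round(hnnim_f1 - gain, 3)
              for name, gain in f1_gain.items()}
implied_acc = {name: round(hnnim_acc - gain, 2)
               for name, gain in acc_gain.items()}

for name in f1_gain:
    print(f"{name}: F1 ~ {implied_f1[name]:.3f}, "
          f"accuracy ~ {implied_acc[name]:.2f}%")
```

If the accuracy gains were instead relative improvements, the implied baseline accuracies would differ, so these numbers should be checked against the full paper's result tables.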

WEI Jihong, ZHENG Rongfeng, LIU Jiayong. Research on Malicious TLS Traffic Identification Based on Hybrid Neural Network[J]. Computer Engineering and Applications, 2021, 57(7): 107-114.

