Research on Malicious TLS Traffic Identification Based on Hybrid Neural Network

doi:10.3778/j.issn.1002-8331.2003-0430

Abstract

Traditional machine learning methods for identifying malicious TLS traffic depend heavily on expert experience, and their identification and classification results are unsatisfactory. To address this problem, a Hybrid Neural Network Identification Model (HNNIM) is proposed for identification and classification. The model consists of two layers: the first extracts features, and the second performs identification and classification. In the first layer, the final extracted features have two parts: one part is mined automatically by a deep neural network; the other is selected according to expert experience and then further screened by a deep neural network. The second layer aggregates the features screened by the first layer and uses a fully connected deep neural network for further learning and fitting. Based on an analysis of a large number of TLS traffic samples, the ClientHello and ServerHello messages and the TCP interaction information in TLS traffic are selected as the feature space. Experimental results show that, on the malicious TLS traffic identification task, HNNIM achieves an F1 score of 0.989 on malicious samples, which is 0.016, 0.016, 0.019, and 0.043 higher than the random forest, SVM, XGBoost, and convolutional neural network models, respectively. On the multi-classification task, its average accuracy is 89.28%, which is 9.92%, 9.09%, 11.31%, and 7.03% higher than the random forest, SVM, XGBoost, and convolutional neural network models, respectively.

Keywords: TLS traffic identification, malicious encrypted traffic, traditional machine learning, deep neural network, automatic feature mining
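
The two-layer structure described in the abstract can be illustrated with a minimal, untrained sketch. It is not the authors' implementation: the class name `HybridSketch`, the layer sizes, the activations, the random weights, and the sample inputs are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows the data flow: one branch processes raw handshake data, a second branch screens expert-selected features, and a fully connected network consumes the concatenated result.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Fully connected layer: one output per (weight row, bias) pair."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(rng, n_in, n_out):
    """Random weights stand in for trained parameters (assumption)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


class HybridSketch:
    """Hypothetical two-layer hybrid model; all sizes are illustrative."""

    def __init__(self, raw_dim, expert_dim, branch_dim=4, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        # Layer 1, branch A: features mined automatically from raw input.
        self.auto_branch = init_layer(rng, raw_dim, branch_dim)
        # Layer 1, branch B: expert-selected features, screened by a network.
        self.expert_branch = init_layer(rng, expert_dim, branch_dim)
        # Layer 2: fully connected network over the aggregated features.
        self.hidden = init_layer(rng, 2 * branch_dim, hidden_dim)
        self.output = init_layer(rng, hidden_dim, 1)

    def extract(self, raw_input, expert_features):
        auto = dense(raw_input, *self.auto_branch, relu)
        screened = dense(expert_features, *self.expert_branch, relu)
        return auto + screened  # aggregation by concatenation (assumption)

    def predict(self, raw_input, expert_features):
        features = self.extract(raw_input, expert_features)
        hidden = dense(features, *self.hidden, relu)
        return dense(hidden, *self.output, sigmoid)[0]


# Hypothetical inputs: scaled handshake bytes and three hand-picked statistics.
model = HybridSketch(raw_dim=8, expert_dim=3)
raw = [b / 255 for b in [22, 3, 1, 0, 200, 1, 0, 196]]
expert = [0.4, 0.1, 1.0]
features = model.extract(raw, expert)
score = model.predict(raw, expert)
print(f"{len(features)} aggregated features, malicious score = {score:.3f}")
```

Because the weights are random, the score carries no meaning until the model is trained. For the multi-classification task mentioned in the abstract, the single sigmoid output would presumably be replaced by one output per traffic class.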

WEI Jihong, ZHENG Rongfeng, LIU Jiayong. Research on Malicious TLS Traffic Identification Based on Hybrid Neural Network[J]. Computer Engineering and Applications, 2021, 57(7): 107-114.

