Computer Engineering and Applications, 2021, Vol. 57, Issue (14): 110-115. DOI: 10.3778/j.issn.1002-8331.2005-0161

Tor Anonymous Traffic Identification Based on Histogram-XGBoost

WANG Tengfei, CAI Manchun, YUE Ting, LU Tianliang   

  1. School of Police Information Engineering and Cyber Security, People’s Public Security University of China, Beijing 100076, China
  • Online: 2021-07-15 Published: 2021-07-14

Abstract:

Anonymous communication networks are becoming a hiding place for criminals, which poses serious challenges for network supervision. Effectively identifying anonymous network traffic is a prerequisite for supervising it. This paper proposes Histogram-XGBoost, a traffic identification model for Tor anonymous traffic. The model computes time-related features of the traffic at flow granularity, then applies discretization-like preprocessing to these features to improve their robustness. Finally, drawing on the idea of ensemble learning, the model uses XGBoost to identify Tor anonymous traffic with a relatively small feature dimension. Experimental results show that, compared with existing identification methods, the proposed model achieves considerably higher accuracy and stability.

Key words: anonymous network, Tor, traffic identification, XGBoost
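
The first two stages described in the abstract (flow-level time-related features, followed by discretization-like preprocessing) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature set (mean, standard deviation, minimum, and maximum of packet inter-arrival times), the equal-width binning scheme, the default of four bins, and all function names. The classification stage is not shown; in a full pipeline the binned features would presumably be passed to a gradient-boosted tree classifier such as XGBoost.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def flow_time_features(timestamps):
    """Return inter-arrival-time statistics for one flow.

    The chosen statistics (mean, std, min, max) are an illustrative
    assumption, not the feature set used in the paper.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not gaps:
        # A flow with fewer than two packets has no inter-arrival times.
        return [0.0, 0.0, 0.0, 0.0]
    return [mean(gaps), pstdev(gaps), min(gaps), max(gaps)]


def fit_bin_edges(column, n_bins):
    """Compute the interior edges of equal-width bins for one feature column."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column collapses into a single bin.
        return []
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def discretize(value, edges):
    """Map a raw value to its bin index, from 0 to len(edges)."""
    return bisect_right(edges, value)


def histogram_transform(rows, n_bins=4):
    """Replace every raw feature value with the index of its bin.

    Returns the binned rows and the per-column bin edges, so that the
    same edges can be reused on unseen flows.
    """
    columns = list(zip(*rows))
    all_edges = [fit_bin_edges(col, n_bins) for col in columns]
    binned = [
        [discretize(v, edges) for v, edges in zip(row, all_edges)]
        for row in rows
    ]
    return binned, all_edges
```

Calling `flow_time_features` on each flow's packet timestamps and then `histogram_transform` on the resulting rows yields small integer features. Binning of this kind makes the features less sensitive to small timing jitter, which is one plausible reading of the robustness gain the abstract mentions.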
