Computer Engineering and Applications (计算机工程与应用) ›› 2023, Vol. 59 ›› Issue (12): 270-277. DOI: 10.3778/j.issn.1002-8331.2203-0062

• Network, Communication and Security •

基于自适应特征选择与KNN的网络流量分类研究

李道全,李腾,李玉秀   

  1. 青岛理工大学 信息与控制工程学院,山东 青岛 266520
  • 出版日期:2023-06-15 发布日期:2023-06-15

Research on Network Traffic Classification Based on Adaptive Feature Selection and KNN

LI Daoquan, LI Teng, LI Yuxiu   

  1. School of Information and Control Engineering, Qingdao University of Technology, Qingdao, Shandong 266520, China
  • Online: 2023-06-15 Published: 2023-06-15

摘要: 随着互联网技术的不断发展,用户可以在手机或电脑上通过各种应用程序访问互联网,但一些恶意程序产生的异常流量给网络环境带来了危害。针对这一问题,提出了一种基于自适应特征选择与改进KNN的网络流量分类模型。通过引进余弦相似度的互信息法设置了特征筛选倾向度对数据集所有特征进行排序,根据每个特征子集的特征适应度选出最优特征子集,根据各类流量之间的类间距离拆解多分类问题,采用改进KNN算法对流量进行分类。实验结果表明,所提方法在样本不均衡的相似类型流量分类问题上提升效果显著,且整体达到了较好的分类性能。

关键词: 流量分类, K近邻法, 特征适应度, 类间距离

Abstract: With the continuous development of Internet technology, users can access the Internet through a wide range of applications on mobile phones and computers, but the abnormal traffic generated by some malicious programs harms the network environment. To address this problem, this paper proposes a network traffic classification model based on adaptive feature selection and an improved KNN algorithm. First, a feature selection tendency is defined by introducing cosine similarity into the mutual information method, and all features of the data set are ranked by it. Next, the optimal feature subset is selected according to the feature fitness of each candidate subset. The multi-class problem is then decomposed according to the inter-class distances between traffic types, and finally the improved KNN algorithm classifies the traffic. Experimental results show that the proposed method significantly improves the classification of similar traffic types with imbalanced samples and achieves good overall classification performance.

Key words: traffic classification, K-nearest neighbor (KNN), feature fitness, inter-class distance
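
Illustrative sketch: the abstract does not give the formulas for the feature selection tendency, the feature fitness, or the improved KNN, so the Python code below is only a rough illustration of the general idea and not the authors' method. It assumes a hypothetical tendency score (histogram-estimated mutual information with the class label, discounted by the largest absolute cosine similarity to already ranked, mean-centered features), a hypothetical fitness (hold-out accuracy of a plain KNN on each ranked prefix), and synthetic data. All function names are made up for this example, and the inter-class distance decomposition is not shown.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate (in nats) of I(x; y) for a continuous feature and integer labels."""
    edges = np.histogram_bin_edges(x, bins=bins)
    x_binned = np.digitize(x, edges[1:-1])  # bin indices in 0..bins-1
    classes = np.unique(y)
    joint = np.zeros((bins, classes.size))
    for j, c in enumerate(classes):
        joint[:, j] = np.bincount(x_binned[y == c], minlength=bins)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def rank_features(X, y, bins=8):
    """Greedy ranking by an assumed score: relevance * (1 - redundancy)."""
    n_features = X.shape[1]
    relevance = np.array([mutual_information(X[:, j], y, bins) for j in range(n_features)])
    centered = X - X.mean(axis=0)
    norms = np.linalg.norm(centered, axis=0)
    norms[norms == 0] = 1.0  # guard against constant columns
    unit = centered / norms
    cos = np.abs(unit.T @ unit)  # absolute cosine similarity between features
    order = [int(np.argmax(relevance))]
    remaining = set(range(n_features)) - set(order)
    while remaining:
        cand = sorted(remaining)
        redundancy = cos[np.ix_(cand, order)].max(axis=1)
        scores = relevance[cand] * (1.0 - redundancy)
        best = cand[int(np.argmax(scores))]
        order.append(best)
        remaining.remove(best)
    return order


def knn_predict(X_train, y_train, X_test, k=5):
    """Plain majority-vote KNN with Euclidean distance (not the paper's improved variant)."""
    dist = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in y_train[nearest]])


def select_subset(X_tr, y_tr, X_val, y_val, order, k=5):
    """Pick the ranked prefix with the best hold-out accuracy (assumed fitness)."""
    best_m, best_acc = 1, -1.0
    for m in range(1, len(order) + 1):
        cols = order[:m]
        acc = float(np.mean(knn_predict(X_tr[:, cols], y_tr, X_val[:, cols], k) == y_val))
        if acc > best_acc:
            best_m, best_acc = m, acc
    return order[:best_m], best_acc


# Synthetic data: columns 0 and 1 are near duplicates, column 2 separates class 2,
# column 3 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, 300)
f0 = y + rng.normal(0.0, 0.2, 300)
X = np.column_stack([
    f0,
    f0 + rng.normal(0.0, 0.05, 300),
    2.0 * (y == 2) + rng.normal(0.0, 0.3, 300),
    rng.normal(0.0, 1.0, 300),
])

order = rank_features(X[:200], y[:200])
subset, best_acc = select_subset(X[:200], y[:200], X[200:], y[200:], order)
print("ranking:", order, "selected:", subset, "hold-out accuracy:", round(best_acc, 3))
```

In this toy setting the redundancy discount pushes the near-duplicate column down the ranking even though its mutual information with the label is high. That is the kind of effect a similarity-aware ranking is meant to produce.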