Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (5): 109-117. DOI: 10.3778/j.issn.1002-8331.1907-0409

Unknown Application Layer Protocol Recognition Based on Adaptive Clustering

HONG Zheng, GONG Qiyuan, FENG Wenbo, LI Yihao   

  1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210042, China
  • Online: 2020-03-01 Published: 2020-03-06

自适应聚类的未知应用层协议识别方法

洪征,龚启缘,冯文博,李毅豪   

  1. 中国人民解放军陆军工程大学 指挥控制工程学院,南京 210042

Abstract:

Application layer protocol recognition extracts, from network traffic carrying application layer data, the key features that identify a protocol, and uses those features to group data of the same protocol type. Existing traffic recognition methods achieve low recognition rates on unknown application layer protocols. To address this, the paper proposes an unknown application layer protocol recognition method based on adaptive clustering. The method builds on the traditional AGNES hierarchical clustering algorithm and clusters application layer protocol data by the similarity of the payload characteristics of network flows. To improve clustering efficiency, it splits the similarity calculation into two parts: similarity between application layer protocol data items, computed before clustering, and similarity between clusters, computed during clustering. This avoids recomputing the similarity between protocol data items. Experimental results show that the method can recognize network traffic of unknown protocols efficiently and accurately.

Key words: protocol recognition, hierarchical clustering, network traffic, network management
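
The two-part similarity calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the byte n-gram Jaccard measure in `payload_similarity`, the average-linkage rule in `cluster_similarity`, and the fixed `threshold` stopping rule are all assumptions chosen for illustration, since the abstract does not specify the payload similarity measure or how the adaptive stopping condition works.

```python
from itertools import combinations


def payload_similarity(a: bytes, b: bytes, n: int = 2) -> float:
    """Jaccard similarity of byte n-grams (a stand-in measure, assumed here)."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def cluster_payloads(payloads, threshold=0.3):
    """AGNES-style bottom-up clustering over a cached similarity matrix.

    Returns a list of clusters, each a list of indices into `payloads`.
    The fixed `threshold` is a hypothetical stopping rule.
    """
    count = len(payloads)

    # Part 1 (before clustering): compare every pair of payloads exactly once.
    sim = [[1.0] * count for _ in range(count)]
    for i, j in combinations(range(count), 2):
        sim[i][j] = sim[j][i] = payload_similarity(payloads[i], payloads[j])

    def cluster_similarity(c1, c2):
        # Part 2 (during clustering): average linkage read from the cache,
        # so no payload pair is ever compared a second time.
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    clusters = [[i] for i in range(count)]
    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (left, right), best = max(
            (((p, q), cluster_similarity(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break  # the closest pair is too dissimilar; stop merging
        clusters[left] = clusters[left] + clusters[right]
        del clusters[right]  # right > left, so index `left` is unaffected
    return clusters
```

Calling `cluster_payloads` on a list of `bytes` payloads returns groups of indices. The design point is that `payload_similarity` runs only while the matrix is built, and every later merge step reads from that matrix.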

摘要:

应用层协议识别是指从承载应用层协议数据的网络流量中提取出可以标识应用层协议的关键特征,并以这些关键特征为基础,将同种类型的应用层协议数据划分在一起。针对现有网络流量识别方法对未知应用层协议识别率低的问题,提出了一种自适应聚类的未知应用层协议识别方法。该方法以传统的AGNES层次聚类算法为基础,依据网络流应用层协议数据的负载特征,基于相似度对应用层协议进行聚类。方法将聚类算法中相似度计算划分为聚类前应用层协议数据间的相似度计算和聚类中簇间的相似度计算两部分,避免了重复性地计算应用层协议数据间的相似度,提升了算法的聚类效率。实验结果表明所提出的方法能够高效准确地对未知协议的网络流量进行识别。

关键词: 协议识别, 层次聚类, 网络流量, 网络管理