Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (8): 72-76.


Improved DBSCAN algorithm and its application

LI Shuangqing, MU Shengdi   

  1. College of Computer Science, Chongqing University, Chongqing 400044, China
  • Online: 2014-04-15 Published: 2014-05-30

一种改进的DBSCAN算法及其应用

李双庆,慕升弟   

  1. 重庆大学 计算机学院,重庆 400044

Abstract: For massive data sets such as network traffic, DBSCAN is very time-consuming, and it also clusters the traffic of some network protocols poorly. In the context of network traffic classification based on the Hidden Markov Model (HMM), an improved DBSCAN algorithm is proposed. The algorithm improves time efficiency and accuracy by reducing both the number of region queries and the time each query takes. Using a divide-and-conquer strategy, the improved algorithm is then applied to construct the HMM of a network protocol automatically. Experimental results show that the algorithm greatly improves time efficiency while maintaining clustering accuracy, and that it correctly builds the HMM for network traffic.

Key words: DBSCAN algorithm, Hidden Markov Model (HMM), divide-and-conquer, automatic modeling

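Abstract (translated from the Chinese): On large-scale data such as network traffic, the density-based clustering algorithm DBSCAN takes too long to converge and clusters some kinds of traffic poorly. Against the background of research on traffic identification based on the Hidden Markov Model (HMM), an improved DBSCAN algorithm is proposed that raises time efficiency and accuracy in two ways: by reducing the number of region queries and by reducing the time of each query. In a novel application, a divide-and-conquer strategy is used to apply the new algorithm to the automatic construction of HMMs for network protocols. Experimental results show that the improved DBSCAN algorithm greatly improves time efficiency while preserving clustering accuracy, and that, by clustering the packets of network flows, it correctly completes the automatic modeling of network protocol HMMs.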

Key words (translated from the Chinese): DBSCAN algorithm, Hidden Markov Model (HMM), divide-and-conquer, automatic modeling
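
The abstract attributes the cost of DBSCAN to its region queries. As a rough illustration of where that cost arises, the following is a minimal sketch of the standard, unimproved DBSCAN procedure with a counter on region queries. It is not the authors' improved algorithm, whose details are not given on this page. The names `region_query`, `dbscan`, and `stats`, the Euclidean distance, and the linear-scan neighbor search are all assumptions made for the example.

```python
from math import dist

NOISE = -1
UNVISITED = None


def region_query(points, i, eps, stats):
    """Return indices of all points within eps of points[i].

    Assumption: a plain linear scan, so each call costs O(n).
    """
    stats["queries"] += 1
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Baseline DBSCAN; returns (labels, number of region queries)."""
    labels = [UNVISITED] * len(points)
    stats = {"queries": 0}
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps, stats)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps, stats)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # core point: expand the cluster
        cluster += 1
    return labels, stats["queries"]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels, queries = dbscan(points, eps=1.5, min_pts=3)
print(labels, queries)
```

In this baseline every point triggers exactly one region query, and each query scans all points, so the total work grows quadratically with the data size. An approach that issues fewer queries, or that answers each one faster (for example with a spatial index), reduces one of those two factors, which is the kind of saving the abstract describes.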