Computer Engineering and Applications, 2016, Vol. 52, Issue (22): 81-85.

• Big Data and Cloud Computing •

Improved clustering algorithm based on density and grid

XING Changzheng, ZHANG Yuan   

  1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Online: 2016-11-15 Published: 2016-12-02

Abstract: Traditional grid clustering algorithms based on density trees suffer from manually set density thresholds, repeated queries of objects within a neighborhood, and improper handling of boundary points. To address these problems, this paper proposes an improved clustering algorithm based on density and grid. First, the algorithm takes the average density of all grid cells as the density threshold, which avoids the bias introduced by setting the threshold manually. Second, it determines the density radius adaptively, so that it can be applied to dynamic clustering. Third, it selects an unlabeled point outside the current neighborhood as the next core point and expands the cluster according to the classification result, so that objects in a neighborhood are no longer queried repeatedly. Finally, it handles boundary points explicitly, which improves clustering accuracy. Experimental results show that the improved algorithm improves both running time and accuracy, and adapts better to dynamic clustering.

Key words: center of gravity, density, grid, dynamic, clustering, boundary point
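
The first step described in the abstract, using the average grid density as the density threshold, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dense_cells`, the parameter `cells_per_dim`, the restriction to 2-D data, the uniform grid over the bounding box, and the choice to average over all cells (including empty ones) are assumptions made only for illustration.

```python
import numpy as np

def dense_cells(points, cells_per_dim):
    """Count points per grid cell and flag cells above the mean density.

    Hypothetical helper: assumes 2-D points and a uniform grid of
    cells_per_dim x cells_per_dim cells over the data's bounding box.
    """
    points = np.asarray(points, dtype=float)
    # Per-cell point counts over the bounding box of the data.
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=cells_per_dim)
    # Assumed reading of the abstract: the threshold is the mean count
    # over all cells, so no threshold has to be chosen by hand.
    threshold = counts.mean()
    # A cell is treated as dense when its count exceeds the threshold.
    return counts, threshold, counts > threshold
```

For example, with the five points (0, 0), (0.1, 0.1), (0.2, 0.2), (0.3, 0.3), (1, 1) and a 2 x 2 grid, the cell counts are 4, 0, 0 and 1, the threshold is 1.25, and only the cell holding four points is flagged as dense. Averaging over non-empty cells only would give a different threshold; the abstract does not say which variant the paper uses.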