Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (12): 149-155. DOI: 10.3778/j.issn.1002-8331.1903-0201

Self-Confirming Clustering Algorithm Based on Residual Error and Density Grid

CHEN Shengfa, JIA Ruiyu   

  1. College of Computer Science and Technology, Anhui University, Hefei 230601, China
  • Online: 2020-06-15 Published: 2020-06-09

基于残差和密度网格的簇心自确认聚类算法

陈胜发,贾瑞玉   

  1. 安徽大学 计算机科学与技术学院,合肥 230601

Abstract:

To address the reliance of the clustering by fast search and find of density peaks (DPC) algorithm on the truncation distance, its high computational complexity, and its need for manual selection of cluster centers, a self-confirming clustering algorithm based on residual error and density grid is proposed. Data objects are first mapped onto a grid, the grid objects are used as the clustering objects, and grid objects that contain no information are removed. The density and distance values of the remaining grid objects are then computed in a specific way, and the grid objects that contain cluster centers are determined by residual analysis. Finally, the edge points and noise points of the grid are handled using their distance to non-edge points and a self-adjusting threshold. Simulation experiments show that the proposed algorithm achieves higher clustering accuracy and faster execution than several other clustering algorithms.

Key words: cluster analysis, grid, density, residual analysis
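
The first steps outlined in the abstract (mapping data objects to a grid, discarding empty grid objects, and computing a density and a distance value per grid object) can be illustrated with a minimal sketch. The abstract does not say how density and distance are defined, so the sketch assumes the simplest choices: density is the number of points in a cell, and distance is the DPC-style distance to the nearest cell of higher density. The function name `grid_density_distance` and the parameter `cell_size` are hypothetical, and the residual-analysis step and the edge/noise handling are not covered.

```python
import math
from collections import Counter


def grid_density_distance(points, cell_size):
    """Return {cell: (rho, delta)} for the non-empty grid cells.

    Assumed definitions (not taken from the paper):
    rho   = number of points that fall in the cell
    delta = distance to the nearest cell with higher density
    """
    # Map each point to an integer cell index. Cells that receive no
    # points never enter the Counter, so empty cells are dropped implicitly.
    counts = Counter(
        tuple(math.floor(x / cell_size) for x in p) for p in points
    )
    cells = list(counts)

    # Rank cells by density (ties broken by first appearance) so that
    # "denser than" is a strict order.
    order = sorted(range(len(cells)), key=lambda i: (-counts[cells[i]], i))

    result = {}
    for rank, i in enumerate(order):
        denser = [cells[j] for j in order[:rank]]
        if denser:
            # DPC-style distance: nearest cell of higher density.
            delta = min(math.dist(cells[i], c) for c in denser)
        else:
            # Densest cell: use the largest distance to any cell.
            delta = max(math.dist(cells[i], c) for c in cells)
        result[cells[i]] = (counts[cells[i]], delta)
    return result


if __name__ == "__main__":
    sample = [(0.2, 0.3), (0.6, 0.1), (0.9, 0.8), (1.5, 0.5),
              (10.1, 10.2), (10.7, 10.4)]
    for cell, (rho, delta) in grid_density_distance(sample, 1.0).items():
        print(cell, rho, round(delta, 3))
```

In this toy input, the cells (0, 0) and (10, 10) combine a comparatively high density with a large distance value, which is the pattern a center-selection step would look for. Working on cells rather than raw points means the pairwise distance loop runs over the number of non-empty cells, not the number of data objects.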

摘要:

为了解决DPC(Clustering by fast search and find of Density Peaks)算法中依赖截断距离、计算复杂度大和需要人工选取簇心的问题,提出了基于残差和密度网格的簇心自确认聚类算法。将数据对象映射到网格上,用网格对象作为聚类对象,删除不含任何信息的网格对象;用特定方式计算网格对象的密度值和距离值;接着通过残差分析确定含有簇心的网格对象;用与非边缘点的距离和自变动的阈值来处理网格边缘点和噪声点。仿真实验表明所提出的算法与一些其他聚类算法对比,有着较高的聚类精度和较低的时间复杂度。

关键词: 聚类分析, 网格, 密度, 残差分析