Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (14): 82-88. DOI: 10.3778/j.issn.1002-8331.1703-0253

• Big Data and Cloud Computing •

Fuzzy clustering based on connected point with max density in sparse border

QIU Gongda1, HE Ming1, ZHU Chaozheng1, YANG Jie2, LIU Yong1

  1. College of Command Information Systems, PLA University of Science and Technology, Nanjing 210000, China
  2. Science and Technology Information Office, Public Security Bureau of Jiangsu Province, Nanjing 210000, China
  • Online: 2018-07-15 Published: 2018-08-06

Abstract: Existing density clustering algorithms depend on empirically chosen parameters and lose accuracy in complex density environments. To address these problems, this paper presents a fuzzy density clustering method that separates and merges basal density clusters based on the connected point with max density in the sparse border between clusters. The density of each data point is calculated with a Gaussian mixture model, which yields a high-dimensional discrete density space. A low-accuracy mesh then connects the data space, and an interpolation algorithm assigns densities to the empty mesh cells, which turns the discrete space into a continuous high-dimensional density space. After the data points are sorted by density, a point is identified as a density maximum if it cannot be reached continuously from the set of points whose density is higher than its own. The neighborhoods of the maximum points are then expanded in density order, and contradictions that arise during the expansion identify the connected points with max density in the sparse border and separate the basal clusters. Finally, the membership grade between basal clusters is calculated from the connected points with max density and the maximum points, and neighboring clusters whose membership grade passes a set threshold are merged to give the final clusters. Simulation comparisons with a variety of density clustering algorithms show that the method greatly reduces the dependence on empirical parameters, applies a globally unified membership grade for merging, and improves clustering accuracy when multiple densities are present, making the classification more objective and efficient.

Key words: Gaussian mixture model, recognition of clusters, membership grade, connected point with max density
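
The separation and merging steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the densities have already been computed and interpolated onto a 1-D grid (the Gaussian mixture and mesh steps are skipped), it assumes all densities are positive, and the membership formula (density at the border point divided by the lower of the two peak densities) is an assumption made for illustration, since the abstract does not state the formula. The function names `basal_clusters` and `merge_clusters` are hypothetical.

```python
def basal_clusters(density):
    """Label cells of a 1-D density grid by density-ordered expansion.

    Returns (labels, peaks, borders):
      labels[i]       basal cluster of cell i
      peaks[c]        index of the density maximum of cluster c
      borders[(a, b)] highest-density cell where clusters a < b meet
    """
    n = len(density)
    # Visit cells from highest to lowest density (ties keep index order).
    order = sorted(range(n), key=lambda i: -density[i])
    labels = [None] * n
    peaks = []
    borders = {}
    for i in order:
        nbrs = [j for j in (i - 1, i + 1)
                if 0 <= j < n and labels[j] is not None]
        if not nbrs:
            # No neighbour of equal or higher density is labelled yet,
            # so this cell is treated as a density maximum.
            labels[i] = len(peaks)
            peaks.append(i)
            continue
        # Expand the cluster of the densest labelled neighbour.
        best = max(nbrs, key=lambda j: density[j])
        labels[i] = labels[best]
        # Expansion conflict: a neighbour belongs to another cluster.
        for j in nbrs:
            if labels[j] != labels[i]:
                pair = tuple(sorted((labels[i], labels[j])))
                # The first conflict for a pair has the highest density.
                borders.setdefault(pair, i)
    return labels, peaks, borders


def merge_clusters(density, peaks, borders, threshold):
    """Merge basal clusters whose (assumed) membership passes the threshold."""
    parent = list(range(len(peaks)))

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for (a, b), s in borders.items():
        # Assumed membership: border density relative to the lower peak.
        membership = density[s] / min(density[peaks[a]], density[peaks[b]])
        if membership >= threshold:
            parent[find(b)] = find(a)
    return [find(c) for c in range(len(peaks))]


if __name__ == "__main__":
    d = [0.1, 0.5, 1.0, 0.8, 0.9, 0.4, 0.05, 0.6, 0.7, 0.2]
    labels, peaks, borders = basal_clusters(d)
    merged = merge_clusters(d, peaks, borders, threshold=0.5)
    print(labels)                        # [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
    print([merged[c] for c in labels])   # [0, 0, 0, 0, 0, 0, 2, 2, 2, 2]
```

In this toy grid the two left peaks are joined by a dense border cell and merge, while the right peak is separated by a sparse border and stays a cluster of its own. A single threshold is applied to every pair of clusters, which mirrors the globally unified merging criterion described in the abstract.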