Computer Engineering and Applications, 2018, Vol. 54, Issue (8): 131-136. DOI: 10.3778/j.issn.1002-8331.1611-0427

• Pattern Recognition and Artificial Intelligence •

Clustering center selection and clustering based on data field

ZHU Zhenguo, FENG Yingzhu

  1. College of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400074, China
  • Online: 2018-04-15 Published: 2018-05-02

Abstract: Existing clustering algorithms commonly suffer from low clustering quality, heavy dependence on parameters, and difficulty in identifying outliers. To address these problems, a clustering algorithm based on the data field is proposed. The algorithm computes the potential value of every data point and identifies cluster centers by two properties: a cluster center has a higher potential than its surrounding neighbors, and it lies at a relatively large distance from other cluster centers, that is, from any point with a higher potential. Outliers are identified by their potential value being equal to zero. Finally, each remaining point is assigned to the cluster of its nearest neighbor with a higher potential, which completes the clustering. Simulation experiments show that the algorithm accurately finds cluster centers and outliers without manual parameter tuning, achieves good clustering results, and is independent of the shape of the data set.

Key words: cluster center, data field, clustering, outlier
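
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the potential function, so the sketch assumes a Gaussian-type data-field potential with an influence factor `sigma`, truncated at an assumed radius of 3·sigma/√2 so that isolated points receive a potential of exactly zero. The function names, the `sigma` and `k` parameters, and the potential-times-distance ranking used to pick centers are all hypothetical simplifications; the paper reports selecting centers without manual parameter tuning.

```python
import math


def potentials(points, sigma, cutoff):
    """Assumed potential: sum of Gaussian contributions from the other
    points that lie within `cutoff`. A point with no such neighbour
    keeps a potential of exactly 0.0."""
    n = len(points)
    phi = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                phi[i] += math.exp(-((d / sigma) ** 2))
    return phi


def cluster(points, sigma, k):
    """Return (center indices, labels); label -1 marks an outlier."""
    cutoff = 3 * sigma / math.sqrt(2)  # assumed influence radius
    phi = potentials(points, sigma, cutoff)
    n = len(points)

    # Outliers: points whose potential is zero.
    outliers = {i for i in range(n) if phi[i] == 0.0}

    # Remaining points, ordered by descending potential.
    order = sorted((i for i in range(n) if i not in outliers),
                   key=lambda i: -phi[i])

    # For each point: nearest point with higher potential and its distance.
    parent, delta = {}, {}
    for rank, i in enumerate(order):
        higher = order[:rank]
        if higher:
            j = min(higher, key=lambda j: math.dist(points[i], points[j]))
            parent[i] = j
            delta[i] = math.dist(points[i], points[j])
        else:
            # The highest-potential point has no parent; give it the
            # largest distance to any other non-outlier point.
            parent[i] = None
            delta[i] = max(math.dist(points[i], points[j]) for j in order)

    # Centers: high potential AND far from any higher-potential point.
    centers = sorted(order, key=lambda i: -phi[i] * delta[i])[:k]

    labels = {i: -1 for i in outliers}
    for cid, c in enumerate(centers):
        labels[c] = cid
    # Descending potential guarantees each parent is labelled first.
    for i in order:
        if i not in labels:
            labels[i] = labels[parent[i]]
    return centers, [labels[i] for i in range(n)]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
           (50, 50)]
    centers, labels = cluster(pts, sigma=1.0, k=2)
    print(centers, labels)
```

Processing points in order of descending potential means every point's nearest higher-potential neighbor already has a label when the point is reached, so assignment needs only a single pass and does not depend on cluster shape.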