Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (10): 147-149.

• Database, Signal and Information Processing •

Optimization of the initial cluster centers of the K-means algorithm

LAI Yu-xia, LIU Jian-ping

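  1. School of Information and Electronics, Zhejiang Sci-Tech University, Hangzhou 310018
  • Received: 2007-07-16 Revised: 2007-10-17 Published: 2008-04-01 Released: 2008-04-01
  • Corresponding author: LAI Yu-xia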

Optimization study on initial center of K-means algorithm

LAI Yu-xia, LIU Jian-ping

  1. College of Electronic Information, Zhejiang Sci-Tech University, Hangzhou 310018, China
  • Received: 2007-07-16 Revised: 2007-10-17 Online: 2008-04-01 Published: 2008-04-01
  • Contact: LAI Yu-xia

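Abstract: The traditional K-means algorithm is sensitive to the initial cluster centers, and its clustering results fluctuate with different initial inputs. To address this problem, an improved density-based K-means algorithm is proposed. The algorithm uses the distribution density of the objects being clustered to determine the initial cluster centers, selecting as initial centers the K points that lie in high-density regions and are farthest from one another. Theoretical analysis and experimental results show that the improved algorithm achieves better clustering results.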

Keywords: clustering, K-means algorithm, density, cluster center, high-density region

Abstract: The traditional K-means algorithm is sensitive to its initial centers. To solve this problem, an improved K-means algorithm based on density is presented. First, it computes the density of the area to which each data object belongs; it then finds K data objects that all lie in high-density areas and are farthest from one another, and uses these K data objects as the initial centers. Theoretical analysis and experimental results demonstrate that the improved algorithm achieves better clustering and eliminates the sensitivity to the initial centers.

Key words: clustering, K-means algorithm, density, clustering center, high density area
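
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define density, the high-density threshold, or how "farthest from one another" is computed, so the choices below are assumptions. Density is taken to be the number of neighbours within a hypothetical radius `eps`, a point counts as high-density when its density is at least the mean, and the mutually distant points are picked greedily (each new center maximizes its distance to the nearest center already chosen). The function name `density_based_initial_centers` is also hypothetical.

```python
import math


def density_based_initial_centers(points, k, eps):
    """Pick k initial centers from high-density points that lie far apart.

    Assumptions (not taken from the paper): density is the neighbour count
    within radius eps, "high density" means at least the mean density, and
    "farthest from one another" is approximated by a greedy maximin rule.
    """
    n = len(points)
    # Density of point i: number of other points within distance eps.
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= eps)
        for i in range(n)
    ]
    mean_density = sum(density) / n
    # Keep only points in high-density regions; this screens out outliers.
    candidates = [i for i in range(n) if density[i] >= mean_density]
    if len(candidates) < k:
        raise ValueError("fewer than k high-density points; try a larger eps")

    # Start from the densest candidate (ties go to the earliest point).
    chosen = [max(candidates, key=lambda i: density[i])]
    while len(chosen) < k:
        # Add the candidate whose nearest chosen center is farthest away.
        nxt = max(
            (i for i in candidates if i not in chosen),
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]


# Three tight groups of four points each, plus one isolated outlier.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (0, 10), (0, 11), (1, 10), (1, 11),
    (50, 50),
]
centers = density_based_initial_centers(points, k=3, eps=1.5)
print(centers)
```

The returned points would then be passed to an ordinary K-means run as its starting centers in place of a random draw. Because the outlier has no neighbours within `eps`, it never becomes a candidate, which is the reason for restricting the choice to high-density regions.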