Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (31): 169-172. DOI: 10.3778/j.issn.1002-8331.2008.31.049

• Database, Signal and Information Processing •

Modified k-means algorithm based on Ward’s method and application

QIU Su-lin1,WANG Li-zhen2   

  1. Department of Information Technology, Yunnan Judicature Police Vocational Academy, Kunming 650211, China
  2. School of Information, Yunnan University, Kunming 650091, China
  • Received:2007-12-06 Revised:2008-04-01 Online:2008-11-01 Published:2008-11-01
  • Contact: QIU Su-lin

基于Ward’s方法的k-平均优化算法及其应用

邱苏林1,王丽珍2   

  1. 云南司法警官职业学院 信息技术系,昆明 650211
  2. 云南大学 信息学院,昆明 650091
  • 通讯作者: 邱苏林

Abstract: After analyzing the disadvantages of the k-means algorithm, the authors present a modified k-means algorithm based on Ward’s method. First, the sample data are clustered with Ward’s method to determine an appropriate number of clusters and the initial cluster centers, and outliers are detected and deleted; the traditional k-means algorithm is then run from these initial parameters. Second, the algorithm is applied to the analysis of criminal personality types. The experimental evaluation shows that the modified k-means algorithm is superior to traditional k-means in both efficiency and clustering quality.

Key words: k-means algorithm, Ward’s method, number of clusters, initial cluster centers, outlier detection
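
The pipeline outlined in the abstract (Ward's preliminary clustering, then outlier removal, then k-means from the resulting centers) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state how the number of clusters is chosen or how outliers are detected, so the stopping rule (`max_merge_cost`), the small-cluster outlier rule (`min_size`), and all function names here are assumptions made for the example.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * dist(centroid(a), centroid(b)) ** 2


def ward_clusters(points, max_merge_cost):
    """Merge clusters by Ward's criterion until the cheapest merge is too costly.

    ASSUMPTION: cutting at a cost threshold is one simple way to let the
    data decide the number of clusters; the paper's rule is not given.
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        cost, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if cost > max_merge_cost:
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        new_centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


def modified_kmeans(points, max_merge_cost, min_size=2):
    """Ward's preliminary clustering -> drop outliers -> k-means."""
    prelim = ward_clusters(points, max_merge_cost)
    # ASSUMPTION: preliminary clusters smaller than min_size are outliers.
    kept = [c for c in prelim if len(c) >= min_size]
    outliers = [p for c in prelim if len(c) < min_size for p in c]
    if not kept:
        raise ValueError("every preliminary cluster was flagged as an outlier")
    inliers = [p for c in kept for p in c]
    centers, groups = kmeans(inliers, [centroid(c) for c in kept])
    return centers, groups, outliers
```

Under these assumptions, the number of clusters is simply the number of preliminary clusters that survive outlier removal, and k-means starts from their centroids rather than from random points. The pairwise search in `ward_clusters` is written for readability and is only practical for small samples.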

摘要: 通过对k-平均算法存在不足的分析,提出了一种基于Ward’s方法的k-平均优化算法。算法首先在用Ward’s方法对样本数据初步聚类的基础上,确定合适的簇数目、初始聚类中心等k-平均算法的初始参数,并进行孤立点检测、删除;基于上述处理再采用传统k-平均算法进行聚类。将优化的k-平均算法应用到罪犯人格类型分析中,实验结果表明,该算法的效率、聚类效果均明显优于传统k-平均算法。

关键词: k-平均算法, Ward’s方法, 簇数目, 初始聚类中心, 孤立点检测