Computer Engineering and Applications (计算机工程与应用) ›› 2007, Vol. 43 ›› Issue (2): 177-177.

• Database and Information Processing •

An Intersection-Based Clustering Combination Algorithm

江永全,杨燕,许翔燕   

  1. School of Information, Southwest Jiaotong University
  • Received: 2006-01-10 Revised: 1900-01-01 Online: 2007-01-11 Published: 2007-01-11
  • Corresponding author: 江永全 (YongQuan Jiang) river0001

Clustering Combination Algorithm Based on Intersection

YongQuan Jiang, Yan Yang, Xiangyan Xu

  1. School of Information, Southwest Jiaotong University
  • Received: 2006-01-10 Revised: 1900-01-01 Online: 2007-01-11 Published: 2007-01-11
  • Contact: YongQuan Jiang

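Abstract: Clustering, a form of unsupervised learning, automatically groups data according to how similar the objects are. This paper proposes a new intersection-based method for combining clusterings that borrows the idea of election voting. Given several different clustering results for the same data set, the algorithm first determines the correspondence between the clusters of the different results, then computes the intersection of the corresponding clusters across these results and puts the remaining disputed objects to a vote; finally, objects whose membership is still undetermined after voting are assigned to the cluster containing their nearest object. Alternatively, the disputed objects can be assigned directly to the cluster of their nearest object without voting. Experiments show that both methods clearly improve clustering quality, and that the result obtained with voting is slightly better than the result without it.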

Keywords: intersection, voting, clustering, clustering combination

Abstract: Clustering is an unsupervised learning technique that divides data into groups of similar objects. This paper presents a new intersection-based clustering combination algorithm modeled on voting. Given several different clustering results for the same data set, the algorithm first establishes the correspondence between the clusters of the different results, then computes the intersection of the corresponding clusters and puts the remaining disputed objects to a vote. Finally, it assigns any objects still undecided after voting to the cluster of their nearest object. Alternatively, it assigns the disputed objects directly to the cluster of their nearest object without voting. Experiments indicate that both methods clearly improve clustering quality, and that the result with voting is slightly better than the result without voting.

Key words: intersection, vote, clustering, clustering combination
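
The procedure summarized in the abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say how cluster correspondence is found, which voting rule is used, or which distance defines the "nearest object". Here the hypothetical helper `align` matches clusters by brute-force search over label permutations (only practical for small k), voting requires a strict majority, and distance is Euclidean over objects that were already settled by intersection or voting.

```python
from collections import Counter
from itertools import permutations
from math import dist


def align(reference, labels, k):
    """Relabel `labels` so its clusters overlap `reference` as much as possible.

    Assumption: correspondence is found by trying every permutation of the
    k labels, which is only feasible for small k.
    """
    best = max(
        permutations(range(k)),
        key=lambda p: sum(p[l] == r for l, r in zip(labels, reference)),
    )
    return [best[l] for l in labels]


def combine(points, clusterings, k, vote=True):
    """Combine several clusterings (label lists in 0..k-1) of the same points."""
    ref = clusterings[0]
    aligned = [ref] + [align(ref, c, k) for c in clusterings[1:]]
    n = len(points)
    final = [None] * n

    for i in range(n):
        label, count = Counter(c[i] for c in aligned).most_common(1)[0]
        if count == len(aligned):
            final[i] = label          # object lies in the intersection
        elif vote and count > len(aligned) / 2:
            final[i] = label          # assumed rule: strict majority wins

    # Objects still undecided go to the cluster of the nearest settled object.
    settled = [i for i in range(n) if final[i] is not None]
    for i in range(n):
        if final[i] is None:
            if settled:
                nearest = min(settled, key=lambda j: dist(points[i], points[j]))
                final[i] = final[nearest]
            else:
                final[i] = ref[i]     # fallback when nothing is settled
    return final


# Toy data: seven 1-D points and three clusterings that disagree on point 3.
points = [(0.0,), (1.0,), (2.0,), (4.0,), (8.0,), (9.0,), (10.0,)]
a = [0, 0, 0, 0, 1, 1, 1]
b = [1, 1, 1, 0, 0, 0, 0]   # same split as c, but with the labels swapped
c = [0, 0, 0, 1, 1, 1, 1]

with_vote = combine(points, [a, b, c], k=2, vote=True)      # [0, 0, 0, 1, 1, 1, 1]
without_vote = combine(points, [a, b, c], k=2, vote=False)  # [0, 0, 0, 0, 1, 1, 1]
```

In this toy example the two variants disagree on the disputed point: voting follows the two clusterings that place it in the right-hand group, while the no-voting variant assigns it to the cluster of its nearest settled neighbor at 2.0.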