Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (13): 132-134. DOI: 10.3778/j.issn.1002-8331.2010.13.039

• Database, Signal and Information Processing •

Research on conceptual clustering based on bipartite graph

SHI Jin-cheng (史金成)1, HU Xue-gang (胡学钢)2

  1. Department of Mathematics and Computer Science, Tongling College, Tongling, Anhui 244000, China
  2. School of Computer & Information, Hefei University of Technology, Hefei 230009, China
  • Received: 2009-07-30 Revised: 2009-09-14 Online: 2010-05-01 Published: 2010-05-01
  • Contact: SHI Jin-cheng

Abstract: In traditional conceptual clustering algorithms, updating and storing clusters depends not only on the number of objects and the number of attributes but also on the number of attribute values. This limitation makes such algorithms unsuitable for large data sets. A new conceptual clustering algorithm based on bipartite graphs, called BGBCC, is proposed. It clusters data together with their attributes (conjunctive clustering) by obtaining an approximate collection of maximum ε-bicliques of the bipartite graph. Experiments and analysis show that the algorithm obtains good clustering descriptions and can perform conceptual clustering of large data sets in sublinear time.

Key words: machine learning, conceptual clustering, conjunctive clustering, bipartite graph, biclique
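
The abstract does not define an ε-biclique or give the steps of BGBCC, so the following is only a minimal sketch of the underlying idea, not the authors' algorithm. It assumes the common definition that a pair (objects, attributes) is an ε-biclique when every object in it has at least a (1 − ε) fraction of the attributes. The function names, the toy data, the exhaustive seed enumeration, and the score |objects| × |attributes| are all illustrative assumptions; an approximate method would sample seeds rather than enumerate them.

```python
from itertools import combinations


def is_eps_biclique(adj, objects, attrs, eps):
    """Check the assumed definition: every object has >= (1 - eps) of attrs.

    adj maps each object to the set of attributes it possesses.
    """
    if not objects or not attrs:
        return False
    need = (1 - eps) * len(attrs)
    return all(len(adj[o] & attrs) >= need for o in objects)


def grow_from_seed(adj, seed, eps):
    """Take the attributes shared by the seed objects, then collect every
    object that has at least a (1 - eps) fraction of those attributes."""
    attrs = set.intersection(*(adj[o] for o in seed))
    if not attrs:
        return set(), set()
    need = (1 - eps) * len(attrs)
    objects = {o for o, a in adj.items() if len(a & attrs) >= need}
    return objects, attrs


def best_eps_biclique(adj, eps, seed_size=2):
    """Try every seed of the given size (hypothetical brute-force stand-in
    for sampling) and keep the candidate with the largest score."""
    best = (set(), set())
    for seed in combinations(sorted(adj), seed_size):
        objects, attrs = grow_from_seed(adj, seed, eps)
        if len(objects) * len(attrs) > len(best[0]) * len(best[1]):
            best = (objects, attrs)
    return best


# Toy object-attribute data (hypothetical, for illustration only).
ADJ = {
    "sparrow": {"feathers", "flies", "lays_eggs"},
    "eagle": {"feathers", "flies", "lays_eggs"},
    "penguin": {"feathers", "lays_eggs", "swims"},
    "salmon": {"swims", "lays_eggs", "scales"},
    "trout": {"swims", "scales", "lays_eggs"},
}

objects, attrs = best_eps_biclique(ADJ, eps=0.34)
print(sorted(objects), sorted(attrs))
```

With ε = 0.34 the penguin joins the bird cluster even though it lacks one of the three shared attributes; with ε = 0 the same pair would no longer qualify, which is the tolerance that distinguishes an ε-biclique from an exact biclique. The resulting attribute set doubles as the conjunctive description of the cluster.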
