Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (1): 142-144.

• Database, Signal and Information Processing •

New hierarchical clustering method using information gain

LIU Yiming, ZHANG Huaxiang

  1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2012-01-01 Published: 2012-01-01

Abstract: Hierarchical clustering is an important research topic in pattern recognition and data mining, with broad application prospects. Inspired by the way decision tree learning selects the best classification attribute, a hierarchical clustering method that incorporates information gain is proposed. The method uses information gain to guide attribute weighting in hierarchical clustering, thereby improving the quality of the clustering results. Experimental results on UCI data sets show that the algorithm performs clearly better than the original hierarchical clustering algorithm.

Key words: hierarchical clustering, information gain, attribute weighting
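The idea of weighting attributes by information gain before clustering can be illustrated with a minimal sketch. The abstract does not say how the gain is computed when no class labels are available, so everything below is an assumption rather than the authors' procedure: provisional labels are taken from a first, unweighted clustering pass, numeric attributes are discretized with equal-width bins, and clusters are merged by average linkage under a weighted Euclidean distance. All function names (`entropy`, `information_gain`, `agglomerate`, `gain_weighted_clustering`) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, bins=3):
    """Gain of one numeric attribute; equal-width binning is an assumption."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant attributes
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    groups = {}
    for b, y in zip(binned, labels):
        groups.setdefault(b, []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def agglomerate(points, weights, k):
    """Average-linkage agglomerative clustering with a weighted Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(dist(points[p], points[q])
                            for p in clusters[i] for q in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for p in members:
            labels[p] = c
    return labels


def gain_weighted_clustering(points, k):
    """Assumed two-pass scheme: cluster, derive gain-based weights, cluster again."""
    dims = len(points[0])
    provisional = agglomerate(points, [1.0] * dims, k)  # unweighted first pass
    gains = [max(0.0, information_gain([p[d] for p in points], provisional))
             for d in range(dims)]
    total = sum(gains)
    # Fall back to uniform weights if no attribute carries any gain.
    weights = [g / total for g in gains] if total > 0 else [1.0 / dims] * dims
    return agglomerate(points, weights, k), weights
```

Calling `gain_weighted_clustering(points, k)` on a list of equal-length numeric tuples returns the final cluster labels together with the normalized attribute weights. Attributes whose binned values separate the provisional clusters well receive larger weights and therefore dominate the distance in the second pass.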