Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (1): 142-144.

• Database, Signal and Information Processing •

New hierarchical clustering method using information gain

LIU Yiming, ZHANG Huaxiang   

  1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2012-01-01 Published:2012-01-01

Hierarchical clustering algorithm incorporating information gain

刘一鸣,张化祥   

  1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014

Abstract: Hierarchical clustering analysis is an important subject in pattern recognition and data mining and has broad application prospects. Inspired by the way decision tree algorithms select the best classification attribute, a novel hierarchical clustering algorithm using information gain is proposed. The algorithm computes information gains and uses them to guide attribute weighting in hierarchical clustering, thereby improving the quality of the clustering results. Experimental results on UCI machine learning data sets indicate that it yields better stability than the original hierarchical clustering algorithm.

Key words: hierarchical clustering, information gain, attribute weighting

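Abstract (translated from Chinese): Hierarchical clustering analysis is an important research topic in pattern recognition and data mining and has broad application prospects. Inspired by the selection of the best classification attribute in decision tree learning, a hierarchical clustering method incorporating information gain is proposed. The method uses information gain to guide attribute weighting in hierarchical clustering, thereby improving the quality of the clustering results. Experimental results on UCI data sets show that the algorithm clearly outperforms the original hierarchical clustering algorithm.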

Key words (translated from Chinese): hierarchical clustering, information gain, attribute weighting
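
The abstract describes weighting attributes by information gain inside a hierarchical clustering, but it does not say how the gains are obtained in an unsupervised setting or which linkage and distance are used. The following is therefore only a minimal sketch under stated assumptions, not the authors' algorithm: it assumes categorical attributes, assumes that provisional cluster labels are already available (for example from an initial unweighted clustering), and uses average linkage with a weighted mismatch distance. All function names (entropy, information_gain, gain_weights, weighted_distance, agglomerate) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction in `labels` after splitting on one discrete attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_weights(rows, labels):
    """Per-attribute information gains, normalized to sum to 1 (assumed scheme)."""
    gains = [information_gain([r[j] for r in rows], labels)
             for j in range(len(rows[0]))]
    total = sum(gains)
    if total == 0:  # no attribute is informative: fall back to uniform weights
        return [1 / len(gains)] * len(gains)
    return [g / total for g in gains]


def weighted_distance(a, b, weights):
    """Weighted mismatch distance between two categorical records."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def agglomerate(rows, weights, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        total = sum(weighted_distance(rows[i], rows[j], weights)
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        a, b = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Toy data (hypothetical): attribute 0 is informative, attribute 2 is mostly noise.
rows = [("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
        ("b", "y", "q"), ("b", "z", "p"), ("b", "z", "q")]
labels = [0, 0, 0, 1, 1, 1]  # assumed provisional labels

weights = gain_weights(rows, labels)
clusters = agglomerate(rows, weights, k=2)
print("weights:", [round(w, 3) for w in weights])
print("clusters:", clusters)
```

In this sketch an attribute that separates the provisional labels well receives a larger weight, so disagreements on it count more in the distance, while a noisy attribute contributes little. Normalizing the gains to sum to 1 is one possible choice; the paper may weight attributes differently.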