计算机工程与应用 ›› 2012, Vol. 48 ›› Issue (5): 23-28.

• 博士论坛 • 上一篇    下一篇

一种基于负补偿自由能量方程的聚类算法

桂 冰,高 俊   

  1. 上海应用技术学院 计算机科学与信息工程学院,上海 201418
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2012-02-11 发布日期:2012-02-11

Penalized clustering algorithm based on negative free energy function

GUI Bing, GAO Jun   

  1. School of Computer Science & Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2012-02-11 Published:2012-02-11

摘要: 基于有限混合模型的聚类算法具有以下缺陷:聚类结果依赖于模型的初始化参数;聚类结果容易收敛于局部最优;聚类过程无法决策聚类数量。为了解决这些问题,提出了一种基于负补偿函数的自由能量方程,对此方程的训练会产生类似于模拟退火的效应,增大了获得全局最优聚类的可能性。提出了一种基于补偿函数的泛化模型选择方法以用于聚类数量决策。提出了一种基于聚类重叠度的动态控制法以用于权衡退火效应以及聚类数量的决策。实验结果表明,新算法的聚类性能明显优于其他传统算法。

关键词: 退火算法, 聚类算法, 期望最大化, 有限混合模型, 自由能量方程, 模型选择

Abstract: Drawbacks of finite-mixture-model-based clustering algorithms are as follows:the dependence on the choices of initial parameters;the convergence to a local optimum;the incapability of cluster number determination. To tackle these problems, a new model is developed by introducing a penalty function into the negative free energy function. The learning of this model presents a similar annealing process as that of the simulated-annealing algorithm, which may increase the chance of global optimization. A generalized model selection method is introduced for cluster-number determination. To avoid either a sudden temperature decay in an annealing process or an over-or under-estimation in the number of clusters, the weight of penalty function is adaptively controlled by a novel method based on the overlapping degree between clusters. Experimental results show that the proposed approach outperforms the conventional algorithms.

Key words: annealing algorithm, clustering, expectation maximization, finite mixture model, free energy function, model selection