Computer Engineering and Applications (计算机工程与应用) ›› 2014, Vol. 50 ›› Issue (7): 133-138.

• Databases, Data Mining, Machine Learning •

Feature selection and spectral clustering method for visual vocabulary generation

WANG Xin, LI Lu

  1. Department of Mathematics and Physics, Anhui Jianzhu University, Hefei 230601, China
  • Online: 2014-04-01 Published: 2014-04-25

Abstract: Visual dictionaries are conventionally generated by K-means clustering. This approach has two weaknesses: as an unsupervised method it makes no use of prior category information, and the inherent limitations of the K-means algorithm degrade the quality of the resulting dictionary. To address both problems, this paper proposes a visual dictionary construction algorithm based on spectral clustering. The training samples are first partitioned according to their category labels, and features are selected using a dynamic mutual information measure. Spectral clustering is then performed in the feature space to generate the final visual dictionary. The method exploits both the category information of the samples and the advantages of spectral clustering, and it effectively addresses the problems caused by the high dimensionality and structural complexity of the image feature space. Experimental results on the Scene-15 dataset demonstrate the effectiveness of the proposed algorithm.

Key words: scene recognition, visual dictionary, K-means clustering, spectral clustering, mutual information
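
The abstract does not spell out the dynamic mutual information measure, so the following is only a minimal sketch of the simpler, static idea behind it: score each feature by its mutual information with the category label and keep the highest-scoring ones. The function names `mutual_information` and `rank_features`, the toy samples, and the labels are all hypothetical, and the features are assumed to be already discretized (real-valued local descriptors would need to be quantized first). It is not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Estimate I(F; C) in bits from paired discrete observations."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    count_f = Counter(feature_values)
    count_c = Counter(labels)
    mi = 0.0
    for (f, c), count in joint.items():
        # p(f, c) * log2(p(f, c) / (p(f) * p(c))), written with raw counts
        mi += (count / n) * math.log2(count * n / (count_f[f] * count_c[c]))
    return mi


def rank_features(samples, labels, top_k):
    """Return the indices of the top_k features by MI, plus all scores."""
    n_features = len(samples[0])
    scores = [
        mutual_information([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    # Static ranking only: scores are NOT re-estimated after each selection,
    # which is where a dynamic variant would differ (assumption).
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:top_k], scores


# Hypothetical toy data: 4 samples, 3 binary features, 2 scene categories.
samples = [
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 0, 1),
]
labels = ["coast", "coast", "forest", "forest"]

selected, scores = rank_features(samples, labels, top_k=1)
print(selected, scores)
```

In this toy example only the first feature separates the two categories, so it is the one retained. In a pipeline of the kind the abstract describes, the retained features would then be passed to the spectral clustering step that produces the visual words.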