Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (16): 134-137. DOI: 10.3778/j.issn.1002-8331.1701-0261

• Pattern Recognition and Artificial Intelligence •

Nearest neighbor and closed pattern subspace clustering

SONG Kuiyong1,2, WANG Nianbin1, WANG Hongbin1, KOU Xiangxia2

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150000, China
  2. Department of Information Engineering, Hulunbuir Vocational Technical College, Hulunbuir, Inner Mongolia 021000, China
  • Online: 2017-08-15 Published: 2017-08-31

Abstract: Traditional distance measures lose much of their discriminative power on high-dimensional data. To address this problem, a shared nearest neighbor subspace clustering algorithm (SNN_SC) is proposed. The data set is converted, dimension by dimension, into multiple nearest-neighbor transaction databases, and the maximal co-occurring object sets are mined from each transaction database; these sets are the clusters on a single dimension. Closed frequent itemsets are then mined from the one-dimensional clusters: the dimensions that contain a closed frequent itemset form a subspace, and the closed frequent itemset is the cluster in that subspace. Comparative experiments show that SNN_SC locates subspaces more accurately and generates complete clusters in them.

Key words: high dimensional, shared nearest neighbor, subspace clustering, closed frequent itemsets
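
The two stages described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no parameters or mining procedure, so the function names, the parameters `k`, `min_support`, `min_dims` and `min_size`, the toy data, and the brute-force intersection closure (standing in for a real closed-itemset miner) are all assumptions made for illustration.

```python
def knn_transactions(values, k):
    """One transaction per object: the object plus its k nearest
    neighbors measured on this single dimension."""
    n = len(values)
    transactions = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: abs(values[j] - values[i]))
        transactions.append(frozenset([i] + others[:k]))
    return transactions


def closed_sets(transactions):
    """All non-empty intersections of transactions, i.e. the closed
    itemsets. Brute force, only suitable for tiny inputs."""
    closed = set(transactions)
    frontier = set(closed)
    while frontier:
        new = set()
        for a in frontier:
            for b in closed:
                c = a & b
                if c and c not in closed:
                    new.add(c)
        closed |= new
        frontier = new
    return closed


def support(itemset, transactions):
    """Number of transactions that contain the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def one_dim_clusters(values, k, min_support, min_size=2):
    """Stage 1 (assumed reading): maximal object sets that co-occur in
    at least min_support nearest-neighbor transactions."""
    tx = knn_transactions(values, k)
    frequent = [s for s in closed_sets(tx)
                if len(s) >= min_size and support(s, tx) >= min_support]
    return [s for s in frequent if not any(s < t for t in frequent)]


def subspace_clusters(data, k, min_support, min_dims=2, min_size=2):
    """Stage 2 (assumed reading): closed object sets shared by the 1-D
    clusters of at least min_dims dimensions; those dimensions are the
    subspace, and the object set is the cluster in it."""
    n_dims = len(data[0])
    tagged = [(d, c) for d in range(n_dims)
              for c in one_dim_clusters([row[d] for row in data],
                                        k, min_support, min_size)]
    result = {}
    for s in closed_sets([c for _, c in tagged]):
        dims = sorted({d for d, c in tagged if s <= c})
        if len(s) >= min_size and len(dims) >= min_dims:
            result[tuple(sorted(s))] = dims
    return result


# Hypothetical toy data: 6 objects, 3 dimensions. Objects 0-2 and 3-5
# are grouped on dimensions 0 and 1; dimension 2 is unrelated noise.
data = [
    [1.0, 2.0, 0.0],
    [1.1, 2.1, 7.0],
    [1.2, 2.2, 3.0],
    [5.0, 9.0, 7.1],
    [5.1, 9.1, 0.1],
    [5.2, 9.2, 3.1],
]

result = subspace_clusters(data, k=2, min_support=3)
for objects, dims in sorted(result.items()):
    print(objects, dims)
# (0, 1, 2) [0, 1]
# (3, 4, 5) [0, 1]
```

In this toy run the noise dimension still yields a one-dimensional cluster, {2, 4}, but no other dimension supports it, so the second stage discards it. A practical implementation would replace the brute-force closure with a dedicated closed frequent itemset miner.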