Computer Engineering and Applications (计算机工程与应用) ›› 2008, Vol. 44 ›› Issue (14): 142-144.

• Database, Signal and Information Processing •

一种基于GN算法的文本概念聚类新方法

安 娜,谢福鼎,张 永,刘绍海   

  1. 辽宁师范大学 计算机与信息技术学院, 辽宁 大连 116029
  • 收稿日期:2007-11-12 修回日期:2008-01-28 出版日期:2008-05-11 发布日期:2008-05-11
  • 通讯作者: 安 娜

New method for text concept clustering based on GN algorithm

AN Na, XIE Fu-ding, ZHANG Yong, LIU Shao-hai

  1. Department of Computer Science, Liaoning Normal University, Dalian, Liaoning 116029, China
  • Received: 2007-11-12 Revised: 2008-01-28 Online: 2008-05-11 Published: 2008-05-11
  • Contact: AN Na

摘要: 文本聚类是当前文本信息挖掘的基础和研究的重点。给出一种新的文本聚类方法,它将概念格和复杂网络有机地结合起来,以达到更优的聚类效果。首先计算关键词特征权值并对特征向量进行降维处理,然后根据关键词权值大小映射到形式背景中,通过本文所给出的新的相似度公式,计算出形式背景中概念相似度的大小,从而构造GN网络并应用GN算法进行文本概念聚类。最后通过实例,验证了方法的可行性。

关键词: 复杂网络, GN算法, 文本聚类, 概念格

Abstract: Text clustering is a fundamental and actively studied problem in text mining. This paper presents a new text clustering method that combines the advantages of concept lattices and complex networks. The algorithm first computes the feature weights of the keywords and reduces the dimensionality of the feature vectors, then maps the keywords into a formal context according to their weights. Next, the similarities between concepts in the formal context are computed using a new similarity formula proposed in the paper. A network is then constructed from these similarities, and the GN algorithm is applied to it to cluster the text concepts. Finally, an example demonstrates the feasibility of the method.

Key words: complex networks, GN algorithm, text clustering, concept lattices
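
The final stage described in the abstract, building a network from concept similarities and splitting it with the GN (Girvan-Newman) algorithm, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the similarity values are invented, the paper's similarity formula is not reproduced, the thresholding rule in `build_graph` and the "stop at k communities" criterion are assumptions, and all function names are hypothetical.

```python
from collections import deque

def build_graph(sim, threshold):
    """Link concepts i and j when sim[i][j] >= threshold (assumed rule)."""
    n = len(sim)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style) for an unweighted graph."""
    bet = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies back toward s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[(min(v, w), max(v, w))] += c
                delta[v] += c
    return bet  # every path is counted from both ends; ranking is unaffected

def components(adj):
    """Connected components found by depth-first search."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def girvan_newman(adj, k):
    """Remove the highest-betweenness edge until k communities appear (assumed stop rule)."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    while True:
        comps = components(adj)
        bet = edge_betweenness(adj)
        if len(comps) >= k or not bet:
            return comps
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)

# Hypothetical similarities for six concepts: two tight groups joined by one weaker link.
sim = [
    [1.0, 0.8, 0.8, 0.1, 0.1, 0.1],
    [0.8, 1.0, 0.8, 0.1, 0.1, 0.1],
    [0.8, 0.8, 1.0, 0.5, 0.1, 0.1],
    [0.1, 0.1, 0.5, 1.0, 0.8, 0.8],
    [0.1, 0.1, 0.1, 0.8, 1.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.8, 1.0],
]
clusters = girvan_newman(build_graph(sim, threshold=0.4), k=2)
print(sorted(sorted(c) for c in clusters))  # [[0, 1, 2], [3, 4, 5]]
```

In this toy input the single link between concepts 2 and 3 carries every shortest path between the two groups, so it has the highest betweenness and is removed first. The original GN procedure continues removing edges to produce a full hierarchy; stopping at a fixed number of communities is a simplification made here for brevity.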