Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (33): 135-137. DOI: 10.3778/j.issn.1002-8331.2009.33.044

• Database, Signal and Information Processing •

Technique of searching text clustering analysis based on fuzzy set

WAN Hong-xin1, PENG Yun2

  1. Mathematics & Computer Science College, Jiangxi Science & Technology Normal University, Nanchang 330013, China
  2. College of Computer Information and Engineering, Jiangxi Normal University, Nanchang 330022, China
  • Received: 2008-06-30 Revised: 2008-09-12 Online: 2009-11-21 Published: 2009-11-21
  • Contact: WAN Hong-xin

Abstract: Web search text contains a large amount of content whose structure and meaning are uncertain, which makes conventional clustering algorithms difficult to apply. Because the search results are not clustered, the result set returned by a search is very large. A clustering analysis method for search text based on fuzzy sets is proposed, and the algorithm is described in detail through a worked example. Processing heterogeneous data with fuzzy techniques can reduce the time and space complexity of the algorithm, lower the dimensionality of text processing, and improve robustness. A comparison with measured data from other clustering algorithms verifies the accuracy and efficiency of the method.

Key words: clustering analysis, text mining, fuzzy set, membership function

CLC Number: