Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (23): 157-159. DOI: 10.3778/j.issn.1002-8331.2008.23.048

• Database, Signal and Information Processing •

Improvement on clustering of meta search engine combining with association rules

WANG Qiong, GU Wen-xuan, XU Ting-rong

  1. School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Received: 2007-10-15 Revised: 2007-12-27 Online: 2008-08-11 Published: 2008-08-11
  • Contact: WANG Qiong

Abstract: The results returned by the component search engines are segmented into words and their main keywords are extracted. Association rules are then used to build an associated word matrix, and the FCM (Fuzzy C-Means) algorithm performs fuzzy clustering of the results based on this matrix. The cluster validity function FP(U;c) is used to identify the best clustering, and the results are finally returned to the user in order of relevance. An experimental comparison with the K-means algorithm shows that the method preserves running efficiency, yields a valid number of clusters, and moves relevant results to higher ranking positions, so it better meets users' needs.

Key words: meta search engine, Fuzzy C-Means (FCM), association rules, results clustering, associated word matrix
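
The fuzzy clustering step named in the abstract can be illustrated with a minimal sketch of standard Fuzzy C-Means. This is not the authors' implementation: the function name `fcm`, the fuzzifier m = 2, the Euclidean distance, and the deterministic choice of starting centers are all assumptions made for illustration, and the construction of the associated word matrix and the validity-function selection of the cluster count are not reproduced here.

```python
import math

def fcm(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard Fuzzy C-Means on a list of equal-length numeric vectors.

    Returns (centers, u), where u[i][k] is the membership of point i in cluster k.
    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    n, dim = len(points), len(points[0])
    # Assumption: deterministic start from c points spread through the input order.
    centers = [list(points[(k * n) // c]) for k in range(c)]
    u = [[0.0] * c for _ in range(n)]
    p = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        for i, x in enumerate(points):
            d = [math.dist(x, v) for v in centers]
            if min(d) == 0.0:
                # The point coincides with a center: full membership there.
                hit = d.index(0.0)
                u[i] = [1.0 if k == hit else 0.0 for k in range(c)]
            else:
                u[i] = [1.0 / sum((d[k] / d[j]) ** p for j in range(c))
                        for k in range(c)]
        # Center update: mean of the points weighted by u_ik^m
        new_centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            if total == 0.0:
                # Degenerate cluster with no membership: keep the old center.
                new_centers.append(centers[k])
                continue
            new_centers.append(
                [sum(w[i] * points[i][t] for i in range(n)) / total
                 for t in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

In a setting like the one the abstract describes, each row of `points` would be one search result's vector over the associated words, and `fcm` would be run for several candidate values of `c` so that a validity function can pick the best one. Unlike K-means, every result receives a graded membership in each cluster rather than a single hard label.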