Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (3): 129-132.

• Database, Signal and Information Processing •

Query recommendation algorithm based on text clustering search engine

YUAN Jinsheng, CHENG Chaoran   

  1. School of Information, Beijing Forestry University, Beijing 100083, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2012-01-21 Published:2012-01-21

Query expansion algorithm based on a text-clustering search engine

袁津生,程超然   

  1. School of Information, Beijing Forestry University, Beijing 100083

Abstract: Most research on search engines based on text clustering does not provide a good solution for in-depth searching within the small clusters that clustering produces. To address this problem, a clustering-based query recommendation algorithm is proposed. The algorithm improves the similarity formula by using the hierarchical clustering results generated by text clustering, extracts keywords from the target cluster and searches again with them, and then processes the result set with the K-median clustering algorithm to produce recommendations. All processing is done offline to avoid online computation. Experiments show that the algorithm is effective.

Key words: K-median clustering, keyword extraction, similarity formula, query recommendation

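Abstract: Most current research on search engines based on text clustering offers no solution for in-depth querying of the small clusters that clustering produces. To address this problem, a clustering-based query expansion algorithm is proposed. The algorithm uses the tree structure of cluster relationships to improve the similarity formula, extracts topic words from the target cluster and runs a second query with them, and then clusters the query results with the K-median clustering algorithm in order to expand them. The entire process runs offline so that online computation does not slow query response, and experiments verify the effectiveness of the algorithm.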

Key words: K-median clustering, topic word extraction, similarity computation, query expansion
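
The K-median clustering step named in the abstract can be illustrated with a minimal sketch of the textbook algorithm. The abstract does not give the paper's feature representation, distance measure, or its improved similarity formula, so everything below is an assumption for illustration: documents are taken to be numeric vectors, distance is Manhattan (L1), and the names `k_median`, `manhattan`, and `assign` are hypothetical rather than taken from the paper.

```python
import random
from statistics import median


def manhattan(a, b):
    """L1 distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center (by L1 distance) for every point."""
    return [
        min(range(len(centers)), key=lambda j: manhattan(p, centers[j]))
        for p in points
    ]


def k_median(points, k, iterations=20, seed=0):
    """Textbook K-median: like K-means, but each center is the
    coordinate-wise median of its members (assumed variant, not
    necessarily the one used in the paper)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                # Keep the old center when a cluster becomes empty.
                new_centers.append(centers[j])
            else:
                new_centers.append(tuple(median(col) for col in zip(*members)))
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    # Final labels are recomputed so they match the returned centers.
    return centers, assign(points, centers)


if __name__ == "__main__":
    # Toy vectors standing in for result-set documents (made-up data).
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = k_median(docs, k=2)
    print(centers, labels)
```

Using the median rather than the mean makes each center less sensitive to outlying documents, which is the usual reason for preferring K-median over K-means. Whether that is the paper's motivation is not stated in the abstract.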