Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (21): 42-45.

• Theoretical Research, R&D Design •

Research and improvement of HITS algorithm based on Web link analysis

YU Jinping1, ZHU Guixiang2, MEI Hongbiao3

  1. Engineering Research Institute, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
  2. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
  3. College of Applied Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
  • Online: 2013-11-01 Published: 2013-10-30

Abstract: Topic-search strategies for vertical search engines fall into two kinds: those based on content evaluation and those based on Web link analysis. The HITS algorithm is a classic link-analysis strategy, and its main drawback is that it is prone to topic drift. To avoid topic drift as far as possible, this paper proposes F-HITS, an improved algorithm that combines Web page text analysis with diffusion rate. Experimental results show that these improvements both reduce system overhead and raise the accuracy of page search.

Key words: vertical search, search strategy, diffusion rate, text analysis, Hyperlink-Induced Topic Search (HITS)
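
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the classic HITS hub/authority iteration. It is not the paper's F-HITS: the text-analysis and diffusion-rate components are not specified on this page and are not reproduced here. The function names `hits` and `normalize`, the dictionary-based graph format, and the iteration count are illustrative assumptions.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(graph, iterations=50):
    """Classic HITS on a dict mapping each page to the pages it links to.

    Assumption: a fixed iteration count stands in for a convergence test.
    """
    # Collect every page that appears as a link source or a link target.
    pages = set(graph)
    for targets in graph.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in graph.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub update: sum the fresh authority scores of the pages linked to.
        new_hub = {
            page: sum(new_auth[dst] for dst in graph.get(page, ()))
            for page in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


if __name__ == "__main__":
    # Toy graph: pages "a" and "b" both link to "c".
    hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
    print("best authority:", max(auth, key=auth.get))
    print("hub scores:", hub)
```

In this toy graph, "c" receives all of the authority score and "a" and "b" share the hub score equally. Because the scores depend only on link structure, densely linked off-topic pages can rank highly, which is the topic-drift weakness the abstract describes.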