An Improved Search Algorithm for Topic Web Crawlers
Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (10): 174-176.
• Database and Information Processing •
林海霞 原福永 陈金森 刘俊峰
Abstract: The search strategy of a topic web crawler is the core technology of a specialized search engine. Current topic search algorithms, however, tend to be highly greedy and have difficulty finding a globally optimal solution. A comparative analysis shows that the Best-First algorithm, despite its shortcomings, performs best among the algorithms compared. The BS-BS algorithm is therefore proposed on the basis of Best-First. A performance evaluation of BS-BS shows that it not only improves the recall ratio but can also, to some extent, find globally optimal solutions.
Key words: topic web crawler, Best-First algorithm, recall ratio
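For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a generic Best-First crawl and of recall as it is commonly defined. It is not the BS-BS algorithm, which the abstract does not describe. The in-memory `graph`, the precomputed `relevance` scores, the 0.5 relevance threshold, and the function names are all assumptions made for illustration; a real crawler would fetch pages and score them against the topic.

```python
import heapq


def best_first_crawl(graph, relevance, seeds, max_pages):
    """Visit up to max_pages pages, always expanding the best-scored frontier link.

    graph: dict mapping page id -> list of linked page ids (toy stand-in for the web)
    relevance: dict mapping page id -> assumed topic relevance score in [0, 1]
    """
    # heapq is a min-heap, so scores are negated to pop the best link first.
    frontier = [(-relevance.get(seed, 0.0), seed) for seed in seeds]
    heapq.heapify(frontier)
    queued = set(seeds)
    visited = []
    while frontier and len(visited) < max_pages:
        _, page = heapq.heappop(frontier)
        visited.append(page)
        for link in graph.get(page, []):
            if link not in queued:
                queued.add(link)
                # Greedy step: a link is ranked by the score of the page citing it.
                heapq.heappush(frontier, (-relevance.get(page, 0.0), link))
    return visited


def recall(visited, relevant):
    """Fraction of the relevant pages that the crawl actually visited."""
    if not relevant:
        return 0.0
    return len(set(visited) & set(relevant)) / len(relevant)


# Hypothetical toy data: page "e" is highly relevant but sits behind weak pages.
graph = {"seed": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": ["e"], "e": []}
relevance = {"seed": 0.9, "a": 0.8, "b": 0.1, "c": 0.7, "d": 0.2, "e": 0.95}
relevant = {page for page, score in relevance.items() if score >= 0.5}

visited = best_first_crawl(graph, relevance, ["seed"], max_pages=4)
print(visited, recall(visited, relevant))
```

In this toy graph the relevant page `e` is reachable only through the low-scoring pages `b` and `d`, so a small page budget runs out before the crawl gets there. That is one way to picture the greediness the abstract attributes to existing topic search algorithms.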
Link to this article: http://cea.ceaj.org/CN/
http://cea.ceaj.org/CN/Y2007/V43/I10/174