Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (34): 174-176.
• Database and Information Processing •
姚全珠,杨增辉,张 楠,田 元
YAO Quan-zhu, YANG Zeng-hui, ZHANG Nan, TIAN Yuan
Abstract: Because of its hidden nature, the Hidden Web is difficult to crawl directly, which has made it a new research direction in information retrieval. This paper proposes a method for retrieving Hidden Web information, presents a generic operational model for it, and describes the key implementation techniques. It also introduces a heuristic query term selection algorithm, designed in this work, that makes crawling more efficient. Experiments show the effectiveness of both the model and the algorithm.
Key words: information retrieval, Hidden Web, crawler, heuristic algorithm
姚全珠,杨增辉,张 楠,田 元. 基于启发式查询词选择算法的Hidden Web获取研究[J]. 计算机工程与应用, 2007, 43(34): 174-176.
YAO Quan-zhu, YANG Zeng-hui, ZHANG Nan, TIAN Yuan. Research on crawling Hidden Web based on heuristic query selection algorithm[J]. Computer Engineering and Applications, 2007, 43(34): 174-176.
Link to this article: http://cea.ceaj.org/CN/
http://cea.ceaj.org/CN/Y2007/V43/I34/174