Research on crawling Hidden Web based on heuristic query selection algorithm

Computer Engineering and Applications, 2007, Vol. 43, Issue 34: 174-176.

• Database and Information Processing •

YAO Quan-zhu, YANG Zeng-hui, ZHANG Nan, TIAN Yuan

School of Computer Science, Xi’an University of Technology, Xi’an 710048, China

Online: 2007-12-01 Published: 2007-12-01
Contact: YAO Quan-zhu

Abstract

Because of its concealed nature, the Hidden Web is difficult to crawl directly, which has made it a new research direction in information retrieval. This paper proposes a new method for retrieving Hidden Web information. It presents a generic operational model of Hidden Web information retrieval, describes the key techniques involved, and introduces a new heuristic query selection algorithm designed by the authors, which makes crawling more efficient. Experiments show the effectiveness of both the model and the algorithm.

Key words: information retrieval, Hidden Web, crawler, heuristic algorithm

YAO Quan-zhu, YANG Zeng-hui, ZHANG Nan, TIAN Yuan. Research on crawling Hidden Web based on heuristic query selection algorithm[J]. Computer Engineering and Applications, 2007, 43(34): 174-176.
