Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (15): 113-116.

Design and implementation of vertical search engine for education video resources

WEI Renjia, WU Zhenqiang   

  1. School of Computer Science, Shaanxi Normal University, Xi’an 710062, China
  • Online: 2014-08-01 Published: 2014-08-04

面向教育视频资源的垂直搜索引擎设计与实现

魏刃佳,吴振强   

  1. 陕西师范大学 计算机科学学院,西安 710062

Abstract: During the development of an M-Learning (mobile learning) project, and in response to the low utilization rate of education resources in China, this paper integrates education video resources and designs and implements a vertical search engine for them by extending Heritrix and Lucene. A serial combination of Heritrix and Lucene makes it difficult to run crawling and analysis at the same time as indexing. To address this, the paper proposes a tightly coupled, process-optimized combination in which web page crawling, page content analysis and filtering, and index building proceed simultaneously, which reduces system I/O overhead and disk space usage. Experiments show that embedding index building in the running Heritrix crawl has little effect on the running efficiency of the system, so the result meets the needs of practical application.

Key words: video search, Heritrix, Lucene, vertical search engine

摘要: 在移动学习项目的开发过程中,结合我国教育资源利用率低的问题,通过扩展Heritrix和Lucene,整合教育资源,设计并实现了面向教育视频资源的垂直搜索引擎。针对Heritrix与Lucene串行组合方案难以实现信息抓取、分析过程与索引过程同时进行的问题,提出一种紧耦合的流程优化组合方案,使网页抓取、网页内容分析筛选和建立索引同时进行,降低了系统IO开销和磁盘空间的占用率。实验测试表明,在Heritrix运行过程中嵌入索引建立操作,对系统的运行效率影响较小,满足实际应用的需要。

关键词: 视频搜索, Heritrix, Lucene, 垂直搜索引擎
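
The tightly coupled idea described in the abstract can be illustrated with a minimal, self-contained Java sketch. It is not the authors' implementation and it calls neither Heritrix nor Lucene: `Page`, `InvertedIndex`, `isRelevant`, and `crawlAndIndex` are hypothetical names introduced here, the keyword filter is an assumed stand-in for the paper's content analysis, and the in-memory map stands in for a real index. The point it shows is only the control flow: each fetched page is filtered and indexed as it arrives, rather than being written to disk first and indexed in a later serial pass.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from Heritrix or Lucene.
public class TightlyCoupledSketch {

    /** A fetched page; stands in for whatever the crawler hands to its processing step. */
    public static final class Page {
        final String url;
        final String text;

        public Page(String url, String text) {
            this.url = url;
            this.text = text;
        }
    }

    /** Toy in-memory inverted index; stands in for a real index writer. */
    public static final class InvertedIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        /** Split the page text into lowercase terms and record the URL under each term. */
        void add(Page page) {
            for (String term : page.text.toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(page.url);
                }
            }
        }

        /** Return the URLs indexed under a single term (sorted), or an empty set. */
        public Set<String> search(String term) {
            return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
        }
    }

    /** Assumed topic filter: a crude keyword test for education video pages. */
    static boolean isRelevant(Page page) {
        String t = page.text.toLowerCase();
        return t.contains("video") && (t.contains("lecture") || t.contains("course"));
    }

    /**
     * Tightly coupled step: filter and index each page as soon as it is fetched,
     * with no intermediate copy of the page stored for a later indexing pass.
     */
    public static InvertedIndex crawlAndIndex(Iterable<Page> fetched) {
        InvertedIndex index = new InvertedIndex();
        for (Page page : fetched) {
            if (isRelevant(page)) {
                index.add(page);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<Page> fetched = List.of(
            new Page("http://example.org/a", "Calculus lecture video, part 1"),
            new Page("http://example.org/b", "Campus news and events"),
            new Page("http://example.org/c", "Open course video on algorithms"));
        InvertedIndex index = crawlAndIndex(fetched);
        System.out.println(index.search("video"));
    }
}
```

In a serial arrangement, the loop body would instead save every page to disk and a separate pass would read the files back to build the index; folding the filter and the index update into the fetch loop is what removes that extra write and read, at the cost of doing indexing work while the crawl is still running.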