Computer Engineering and Applications, 2012, Vol. 48, Issue (1): 153-156.

• Database, Signal and Information Processing •

Bound model of information retrieval and its parameter optimization

WANG Biao1,2, GAO Guanglai1

  1. School of Computer Science, Inner Mongolia University, Hohhot 010021, China
  2. School of Computer Information Management, Inner Mongolia Finance and Economics College, Hohhot 010070, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2012-01-01 Published: 2012-01-01

Abstract: In information retrieval, understanding and expressing the user's information need accurately is the key to improving retrieval effectiveness. Starting from the connotation and denotation of natural language, this paper mines and calculates the upper bound and lower bound of an information need, determines the information need domain, and builds a representation of the user's information need called the Bound model. A similarity between a document and the information need domain is defined; at retrieval time it is computed for each document, and documents are ranked by it. Comparative experiments using the Lemur toolkit show that the Bound model achieves good retrieval results. The parameters of the similarity measure are then optimized, which improves the retrieval results further.

Key words: information need domain, connotation, denotation, Bound model, information retrieval, parameter optimization
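
The abstract does not state the similarity formula, so the ranking step it describes can only be illustrated under assumptions. The sketch below assumes, purely for illustration, that the lower bound is a set of core query terms, that the upper bound is a superset containing expanded terms, and that the similarity is a weighted mix of how well a document covers each part. The function names, the parameter `alpha`, and the precision-at-k grid search are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set


def bound_similarity(doc_terms: Set[str], lower: Set[str], upper: Set[str],
                     alpha: float = 0.7) -> float:
    """Hypothetical similarity between a document and a need domain.

    alpha weights coverage of the lower bound (assumed core terms);
    1 - alpha weights coverage of terms found only in the upper bound.
    """
    outer = upper - lower  # terms between the two bounds
    core_cov = len(doc_terms & lower) / len(lower) if lower else 0.0
    outer_cov = len(doc_terms & outer) / len(outer) if outer else 0.0
    return alpha * core_cov + (1.0 - alpha) * outer_cov


def rank(docs: Dict[str, Set[str]], lower: Set[str], upper: Set[str],
         alpha: float = 0.7) -> List[str]:
    """Return document ids sorted by descending similarity."""
    return sorted(docs,
                  key=lambda d: bound_similarity(docs[d], lower, upper, alpha),
                  reverse=True)


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranking[:k] if d in relevant) / k


def best_alpha(docs: Dict[str, Set[str]], lower: Set[str], upper: Set[str],
               relevant: Set[str], k: int = 1,
               grid: Iterable[float] = (0.0, 0.25, 0.5, 0.75, 1.0)) -> float:
    """Assumed parameter optimization: pick the grid value with best P@k."""
    return max(grid,
               key=lambda a: precision_at_k(rank(docs, lower, upper, a),
                                            relevant, k))
```

In this sketch a document that covers all core terms and some expanded terms outranks one that covers only part of the core. The grid search stands in for whatever optimization procedure the paper actually uses.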