Computer Engineering and Applications, 2007, Vol. 43, Issue 21: 180-183.

Structured document retrieval model based on Bayesian network

ZHAO Shuang¹, XU Jian-min²

  1. College of Economics and Management, Hebei Polytechnic University, Tangshan, Hebei 063000, China
  2. Computer Department of Mathematics and Computer College, Hebei University, Baoding, Hebei 071002, China
赵 爽¹, 徐建民²

  1. 河北理工大学 经管学院, 河北 唐山 063000
  2. 河北大学 数学与计算机学院, 河北 保定 071002
Abstract: Exploiting the relationships between terms can improve the performance of an information retrieval system. This paper learns term relationships by co-occurrence analysis, applies them to structured document retrieval, and presents a structured document retrieval model based on a Bayesian network, giving the model's topology, probability estimation, and inference process. Experimental results show that the model's retrieval performance is better than that of a model that does not consider term relationships.

Key words: Bayesian network, structured document, structured document retrieval, co-occurrence analysis

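Abstract (Chinese version, translated): Research shows that taking the relationships between terms into account appropriately can improve the performance of a retrieval system. Term relationships are learned from the document collection by co-occurrence analysis and applied to structured document retrieval. A structured document retrieval model based on a Bayesian network is proposed, and its topology, probability estimation, and inference process are given. Experiments show that the model's retrieval performance is better than that of a model that does not consider the relationships between terms.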

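Key words (Chinese version, translated): Bayesian network, structured document, structured document retrieval, co-occurrence analysis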
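
The abstract says that term relationships are learned from the document collection by co-occurrence analysis, but it does not state which estimator is used. As a minimal sketch only, the Python below assumes one common choice: the strength of the relationship from term a to term b is the fraction of documents containing a that also contain b. The function name `cooccurrence_strengths`, the toy documents, and the estimator itself are illustrative assumptions, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_strengths(documents):
    """Return {(a, b): df(a, b) / df(a)} for every term pair that co-occurs.

    `documents` is a list of token lists. The ratio is an assumed estimator
    for illustration; the abstract does not specify the paper's formula.
    """
    doc_freq = Counter()   # number of documents containing each term
    pair_freq = Counter()  # number of documents containing both terms of a pair
    for tokens in documents:
        terms = sorted(set(tokens))  # count each term once per document
        doc_freq.update(terms)
        pair_freq.update(combinations(terms, 2))

    strengths = {}
    for (a, b), n_ab in pair_freq.items():
        # The relationship is directional: normalise by the source term.
        strengths[(a, b)] = n_ab / doc_freq[a]
        strengths[(b, a)] = n_ab / doc_freq[b]
    return strengths


# Toy collection (hypothetical data, for illustration only).
docs = [
    ["bayesian", "network", "retrieval"],
    ["bayesian", "network"],
    ["retrieval", "document"],
]
strengths = cooccurrence_strengths(docs)
```

In a Bayesian network retrieval model, values of this kind could plausibly serve as weights on arcs between term nodes; how the paper actually maps them to conditional probabilities is not described in the abstract.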