Computer Engineering and Applications (计算机工程与应用), 2010, Vol. 46, Issue (11): 131-134. DOI: 10.3778/j.issn.1002-8331.2010.11.040

• Database, Signal and Information Processing •


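LI Lu-ping (李露平)1,2, WANG Qiu-yue (王秋月)1, WANG Shan (王珊)2

  1. School of Information, Renmin University of China (中国人民大学), Beijing 100872
  2. Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing 100872
  • Received: 2008-10-10 Revised: 2009-12-10 Online: 2010-04-11 Published: 2010-04-11
  • Corresponding author: LI Lu-ping (李露平)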

Feedback algorithm of XML retrieval on element level

LI Lu-ping1,2, WANG Qiu-yue1, WANG Shan2

  1. School of Information, Renmin University of China, Beijing 100872, China
  2. Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing 100872, China
  • Received: 2008-10-10 Revised: 2009-12-10 Online: 2010-04-11 Published: 2010-04-11
  • Contact: LI Lu-ping

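Abstract: As a standard for representing and exchanging data on the Web, XML is increasingly widely used. In recent years, element-level XML retrieval has attracted growing attention from information retrieval researchers, and how to improve its effectiveness has become an important research problem. A new feedback algorithm for element-level XML retrieval is implemented in the LEMUR system and substantially improves the precision of the retrieval results. Long-term experiments were carried out with the XML document collection, evaluation system, and other resources provided by INEX. The experimental data show that the algorithm raises the system's average precision by 15.70% when content is used as feedback information, and by 18.19% when both content and structure are used as feedback information.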

Key words: Extensible Markup Language (XML) retrieval, relevance feedback, relevant elements, high-frequency word set

Abstract: As the de facto standard for data representation and exchange on the Web, XML is widely used in many applications. Recent trends in IR research show growing interest in element-level XML retrieval, and improving its effectiveness remains an open issue. A new feedback algorithm is implemented in the LEMUR system to improve the effectiveness of element-level XML retrieval. Long-term experiments were conducted using the XML document collection and evaluation system provided by INEX. The experimental results show that the average precision of element-level retrieval increases by 15.70% when only content information is used as feedback, and by 18.19% when both content and structure information are used.

Key words: Extensible Markup Language (XML) retrieval, relevance feedback, relevant element, high-frequency (HF) word set
