Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (31): 149-152. DOI: 10.3778/j.issn.1002-8331.2008.31.043

• Database, Signal and Information Processing •

Effective storage structure of inverted index

DENG Pan, LIU Gong-shen

  1. Department of Information Security, Shanghai Jiaotong University, Shanghai 200240, China
  • Received: 2008-05-20 Revised: 2008-08-11 Online: 2008-11-01 Published: 2008-11-01
  • Contact: DENG Pan

Abstract: The inverted index is the core component of an information retrieval system, and its storage structure plays a crucial role in both the efficiency and the effectiveness of retrieval. Based on the frequency distribution of Chinese vocabulary and the current hardware and software environment, the authors propose an efficient inverted index structure that saves disk space to some extent, improves retrieval efficiency, and supports incremental update and deletion.

Key words: inverted index, dictionary, capacity, add-on block
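
The abstract does not spell out the storage layout, so the following is only a minimal sketch of how the keywords (dictionary, capacity, add-on block) could fit together, not the authors' actual structure. It assumes each dictionary term owns a posting list made of fixed-capacity blocks, with a new add-on block chained on when the last block is full. All names (`PostingList`, `InvertedIndex`, `capacity`) are hypothetical, and the blocks live in memory rather than on disk.

```python
class PostingList:
    """Posting list kept as a chain of fixed-capacity blocks (assumed layout)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = [[]]  # the initial block, followed by any add-on blocks

    def add(self, doc_id):
        # Skip documents that are already recorded for this term.
        if any(doc_id in block for block in self.blocks):
            return
        # When the last block is full, chain a new add-on block.
        if len(self.blocks[-1]) >= self.capacity:
            self.blocks.append([])
        self.blocks[-1].append(doc_id)

    def remove(self, doc_id):
        # Delete in place; the blocks are not compacted afterwards.
        for block in self.blocks:
            if doc_id in block:
                block.remove(doc_id)
                return True
        return False

    def doc_ids(self):
        return [doc_id for block in self.blocks for doc_id in block]


class InvertedIndex:
    """Dictionary mapping each term to its posting list."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.dictionary = {}

    def add_document(self, doc_id, terms):
        # Incremental update: only the posting lists of this document's terms change.
        for term in set(terms):
            if term not in self.dictionary:
                self.dictionary[term] = PostingList(self.capacity)
            self.dictionary[term].add(doc_id)

    def delete_document(self, doc_id):
        for posting_list in self.dictionary.values():
            posting_list.remove(doc_id)

    def search(self, term):
        posting_list = self.dictionary.get(term)
        return posting_list.doc_ids() if posting_list else []


index = InvertedIndex(capacity=2)
index.add_document(1, ["inverted", "index"])
index.add_document(2, ["index", "dictionary"])
index.add_document(3, ["index"])   # overflows the first block of "index"
index.delete_document(2)
print(index.search("index"))
```

In this sketch a higher `capacity` means fewer add-on blocks per frequent term at the cost of unused space for rare terms, which is the kind of trade-off a frequency-aware design would need to tune.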