Computer Engineering and Applications, 2008, Vol. 44, Issue (3): 175-177.

• Database and Information Processing •

Study on frequency table of stochastic subsection file in constant grade compression method

LU Jun1,2, LIU Da-xin1, GAO Yang2, XIE Xin-qiang2, ZHANG Wei-ran2, MA Yan-bin2

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  2. College of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2008-01-21 Published: 2008-01-21
  • Contact: LU Jun

Abstract: This paper studies how byte frequencies are distributed within the segments of a random file. For 20 k segments, most byte frequencies fall within a contiguous range of length 64, and the number of outlying points is very small. This regularity can be used to compress the frequency table. In addition, a 01-sign (0/1 flag) method is used to optimize the additional information, which reduces the space needed to store it. Compressed storage of the frequency table and the additional information is of great significance for realizing the constant grade compression technique as a whole.

Key words: data compression, permutation and combination, frequency, constant grade compression
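The regularity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the encoding details, so the sketch assumes that "20 k" means 20 KiB, that each in-range frequency is stored as an offset from the start of the length-64 range, and that one 0/1 flag per table entry marks the outliers. All function names (`byte_frequencies`, `best_window`, `encode`, `decode`, `encoded_bits`) are hypothetical.

```python
import random
from collections import Counter

SEGMENT_SIZE = 20 * 1024   # assumption: "20 k" is read as 20 KiB
WINDOW = 64                # length of the contiguous range, from the abstract


def byte_frequencies(segment: bytes) -> list[int]:
    """Return the 256-entry table of byte counts for one segment."""
    counts = Counter(segment)
    return [counts.get(b, 0) for b in range(256)]


def best_window(freqs: list[int], width: int = WINDOW) -> int:
    """Return the start lo of the range [lo, lo + width) covering the most entries."""
    values = sorted(freqs)
    best_lo, best_hits = values[0], 0
    j = 0
    for i, lo in enumerate(values):
        # Advance j past every value that still fits in [lo, lo + width).
        while j < len(values) and values[j] < lo + width:
            j += 1
        if j - i > best_hits:
            best_lo, best_hits = lo, j - i
    return best_lo


def encode(freqs: list[int], lo: int, width: int = WINDOW):
    """Split the table into 0/1 flags, in-range offsets, and raw outliers."""
    flags, offsets, outliers = [], [], []
    for f in freqs:
        if lo <= f < lo + width:
            flags.append(0)          # 0: value stored as a short offset
            offsets.append(f - lo)
        else:
            flags.append(1)          # 1: value stored in full as an outlier
            outliers.append(f)
    return flags, offsets, outliers


def decode(lo: int, flags, offsets, outliers) -> list[int]:
    """Rebuild the original frequency table from the encoded parts."""
    off_it, out_it = iter(offsets), iter(outliers)
    return [next(out_it) if flag else lo + next(off_it) for flag in flags]


def encoded_bits(flags, outliers, width: int = WINDOW,
                 segment_size: int = SEGMENT_SIZE) -> int:
    """Estimate the encoded size in bits under this sketch's assumed layout."""
    count_bits = segment_size.bit_length()    # bits for an arbitrary count
    offset_bits = (width - 1).bit_length()    # 6 bits when width is 64
    inliers = len(flags) - len(outliers)
    # Range start + one flag per entry + short offsets + full-width outliers.
    return count_bits + len(flags) + inliers * offset_bits + len(outliers) * count_bits


if __name__ == "__main__":
    rng = random.Random(0)
    segment = bytes(rng.getrandbits(8) for _ in range(SEGMENT_SIZE))
    freqs = byte_frequencies(segment)
    lo = best_window(freqs)
    flags, offsets, outliers = encode(freqs, lo)
    assert decode(lo, flags, offsets, outliers) == freqs
    raw_bits = 256 * SEGMENT_SIZE.bit_length()
    print("range start:", lo, "outliers:", len(outliers))
    print("raw bits:", raw_bits, "encoded bits:", encoded_bits(flags, outliers))
```

For uniformly random bytes, a 20 KiB segment gives each byte value an expected count of 80 with a standard deviation of roughly 9, so a range of length 64 spans about ±3.6 standard deviations. This is consistent with the abstract's observation that outliers are rare.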