Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (1): 1-1.

• Doctoral Forum •

Design of a Chinese Word Segmentation Dictionary

ZHAI Weibin, ZHOU Zhenliu, JIANG Zhuoming, XU Rongsheng

  1. Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences
  • Received: 2006-08-01 Revised: 1900-01-01 Online: 2007-01-01 Published: 2007-01-01
  • Corresponding author: ZHAI Weibin zhaiwb

Abstract: A word segmentation dictionary is a key foundation of a Chinese information processing system, and the quality of the dictionary's algorithm design directly affects the speed and efficiency of segmentation. This paper adopts a dictionary mechanism based on a dynamic TRIE index tree, and designs and implements a Chinese word segmentation dictionary that effectively reduces the space the dictionary occupies. Experimental results show that the dictionary achieves high query performance.

Key words: Chinese word segmentation, dictionary lookup, Chinese information processing
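
The abstract names a dynamic TRIE index tree but does not describe its layout, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each node keeps its children in a hash map that grows on demand (one common reading of "dynamic"), and it pairs the trie with forward maximum matching, a standard segmentation strategy that the abstract does not itself mention. The names `TrieNode`, `TrieDictionary`, `longest_match`, and `segment` are hypothetical.

```python
class TrieNode:
    """One trie node; child nodes are allocated only when a word needs them."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # character -> TrieNode, grows dynamically
        self.is_word = False  # True if the path from the root spells a word


class TrieDictionary:
    """Hypothetical dictionary wrapper around a character-level trie."""

    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        if not word:
            return  # ignore empty entries
        node = self.root
        for ch in word:
            child = node.children.get(ch)
            if child is None:
                child = TrieNode()
                node.children[ch] = child
            node = child
        node.is_word = True

    def contains(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def longest_match(self, text, start):
        """Return the end index of the longest word beginning at `start`,
        or `start` itself if no dictionary word begins there."""
        node = self.root
        end = start
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_word:
                end = i + 1
        return end


def segment(text, dictionary):
    """Forward maximum matching (an assumed strategy); characters that start
    no dictionary word are emitted as single-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        end = dictionary.longest_match(text, i)
        if end == i:
            end = i + 1
        tokens.append(text[i:end])
        i = end
    return tokens
```

With a toy dictionary containing 中文, 信息, 处理, 信息处理, and 系统, `segment("中文信息处理系统", d)` returns `["中文", "信息处理", "系统"]`: a single walk down the trie finds the longest match at each position, and words sharing a prefix (信息 and 信息处理) share nodes, which is where a trie saves space over storing every word separately.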