Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (10): 144-146.
• Database, Signal and Information Processing •
CHEN Ping, LIU Xiao-xia, LI Ya-jun
Abstract: This paper proposes a Chinese word segmentation method that combines a dictionary with statistics. An improved dictionary structure enables fast segmentation. On top of that pass, statistical methods handle the unregistered (out-of-vocabulary) words it leaves behind, and the method resolves most crossing ambiguities.
Key words: dictionary-based word segmentation, statistics-based word segmentation, crossing ambiguity, unregistered words
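The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the improved dictionary structure or the statistical model, so this example assumes a plain Python set as the dictionary, forward maximum matching for the dictionary pass, and a raw character-bigram count threshold for merging leftover single characters into candidate unregistered words. The names `forward_max_match`, `merge_unknown`, `count_bigrams`, `max_len`, and `min_count` are hypothetical, and crossing ambiguity is not handled here.

```python
from collections import Counter


def forward_max_match(text, dictionary, max_len=4):
    """Dictionary pass: greedily take the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def count_bigrams(corpus):
    """Count adjacent character pairs over an iterable of raw text lines."""
    counts = Counter()
    for line in corpus:
        counts.update(line[i:i + 2] for i in range(len(line) - 1))
    return counts


def merge_unknown(tokens, dictionary, bigram_counts, min_count=2):
    """Statistical pass (assumed): join neighbouring leftover characters
    when their bigram occurs at least min_count times in the corpus."""
    merged = []
    prev_is_leftover = False
    for tok in tokens:
        is_leftover = len(tok) == 1 and tok not in dictionary
        if (is_leftover and prev_is_leftover
                and bigram_counts.get(merged[-1][-1] + tok, 0) >= min_count):
            merged[-1] += tok  # extend the current candidate word
        else:
            merged.append(tok)
        prev_is_leftover = is_leftover
    return merged


# Toy data, invented for illustration only.
dictionary = {"我们", "喜欢", "研究"}
corpus = ["中文分词方法", "分词系统", "研究分词"]

first_pass = forward_max_match("我们喜欢分词", dictionary)
result = merge_unknown(first_pass, dictionary, count_bigrams(corpus))
print(first_pass)  # ['我们', '喜欢', '分', '词']
print(result)      # ['我们', '喜欢', '分词']
```

In this toy run the dictionary pass leaves "分" and "词" as single characters because "分词" is not in the dictionary, and the bigram count then joins them into one candidate word. A real system would likely use a stronger association measure than a raw count.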
CHEN Ping, LIU Xiao-xia, LI Ya-jun. Chinese word segmentation based on dictionary and statistics[J]. Computer Engineering and Applications, 2008, 44(10): 144-146.
Link to this article: http://cea.ceaj.org/CN/
http://cea.ceaj.org/CN/Y2008/V44/I10/144