Computer Engineering and Applications, 2009, Vol. 45, Issue (4): 227-228. DOI: 10.3778/j.issn.1002-8331.2009.04.066
• Engineering and Applications •
CAO Bo,SU Yi-dan,DENG Qi
曹 波,苏一丹,邓 琦
Abstract: The maximum entropy model is used to recognize Chinese personal names automatically. First, the part-of-speech (POS) tags in the corpus are replaced with roles. Second, a feature template is used to extract a feature set from the role-tagged corpus. Third, the parameters of the feature set are trained with the improved iterative scaling (IIS) algorithm. Finally, the Viterbi algorithm is used to assign roles to text that has been roughly segmented, and candidate names are recognized by maximum pattern matching on the role sequence. In a closed test, the precision, recall, and F-measure reach 75.6%, 91.4%, and 82.8%, respectively.
Key words: Chinese name recognition, maximum entropy model, Viterbi algorithm
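The tagging and matching steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the role inventory (`B`, `C`, `O`), the feature strings, the hand-set weights in `WEIGHTS`, the name patterns in `PATTERNS`, and the use of single characters as tokens. The abstract does not give the paper's actual roles, feature template, or patterns, and the IIS training step is not shown; the weights stand in for trained parameters.

```python
import math

# Hypothetical role inventory: B = surname, C = given-name character, O = other.
ROLES = ["B", "C", "O"]

# Hand-set stand-ins for weights that IIS training would normally produce.
WEIGHTS = {
    ("tok=王", "B"): 3.0,
    ("prev=B", "C"): 3.0,
    ("prev=C", "C"): 1.0,
    ("tok=说", "O"): 3.0,
}

# Hypothetical name patterns over the role alphabet.
PATTERNS = ["BCC", "BC"]


def log_prob(prev_role, role, token):
    """Maximum entropy conditional log p(role | prev_role, token)."""
    feats = [f"tok={token}", f"prev={prev_role}"]
    raw = {r: sum(WEIGHTS.get((f, r), 0.0) for f in feats) for r in ROLES}
    log_z = math.log(sum(math.exp(v) for v in raw.values()))
    return raw[role] - log_z


def viterbi(tokens, score):
    """Return the highest-scoring role sequence under local scores."""
    # Each column maps role -> (best cumulative score, best previous role).
    best = [{r: (score(None, r, tokens[0]), None) for r in ROLES}]
    for tok in tokens[1:]:
        col = {}
        for r in ROLES:
            prev, s = max(
                ((p, best[-1][p][0] + score(p, r, tok)) for p in ROLES),
                key=lambda x: x[1],
            )
            col[r] = (s, prev)
        best.append(col)
    # Backtrace from the best final role.
    path = [max(ROLES, key=lambda r: best[-1][r][0])]
    for col in reversed(best[1:]):
        path.append(col[path[-1]][1])
    path.reverse()
    return path


def extract_names(tokens, roles):
    """Scan left to right, preferring the longest matching pattern."""
    role_str = "".join(roles)
    names, i = [], 0
    while i < len(tokens):
        for pat in sorted(PATTERNS, key=len, reverse=True):
            if role_str.startswith(pat, i):
                names.append("".join(tokens[i:i + len(pat)]))
                i += len(pat)
                break
        else:
            i += 1
    return names


tokens = list("王小明说")
roles = viterbi(tokens, log_prob)
print(roles)                         # ['B', 'C', 'C', 'O']
print(extract_names(tokens, roles))  # ['王小明']
```

Trying the longest pattern first is one plausible reading of "maximum pattern matching"; it keeps a three-character name from being cut short by a two-character pattern.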
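Abstract (translated from the Chinese): The maximum entropy model is used to recognize Chinese personal names automatically. First, the part-of-speech tags in the corpus are replaced with roles; then a feature template is used to extract a feature set from the role-substituted corpus; next, the IIS algorithm is used to train the maximum entropy parameters of the feature set; finally, the Viterbi algorithm is used to assign roles to the roughly segmented text, and maximum pattern matching is performed on the role sequence, which yields automatic recognition of Chinese personal names. In the closed test, the precision, recall, and F-value reach 85.4%, 91.2%, and 88.2%, respectively.
Key words (translated from the Chinese): Chinese name recognition, maximum entropy model, Viterbi algorithm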
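As a quick arithmetic check, the reported figures can be compared against the balanced F-measure, F = 2PR / (P + R). This assumes the F-measure used is the balanced (F1) form, which the abstract does not state explicitly; the helper name `f_measure` is introduced here for illustration only.

```python
def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the two versions of the abstract, in percent.
print(round(f_measure(75.6, 91.4), 1))  # 82.8
print(round(f_measure(85.4, 91.2), 1))  # 88.2
```

Under that assumption, each quoted precision and recall pair agrees with the F-measure reported alongside it.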
CAO Bo,SU Yi-dan,DENG Qi. Automatic recognition of Chinese name based on maximum entropy[J]. Computer Engineering and Applications, 2009, 45(4): 227-228.
曹 波,苏一丹,邓 琦. 基于最大熵模型的中国人名自动识别[J]. 计算机工程与应用, 2009, 45(4): 227-228.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2009.04.066
http://cea.ceaj.org/EN/Y2009/V45/I4/227