Computer Engineering and Applications, 2009, Vol. 45, Issue (3): 135-137. DOI: 10.3778/j.issn.1002-8331.2009.03.040

• Database, Signal and Information Processing •

Autolabeling of Chinese verb-verb collocation based on maximum entropy principle

BAI Miao-qing1, ZHENG Jia-heng2

  1. Computer Center, Shanxi University, Taiyuan 030006, China
  2. Department of Computer, Shanxi University, Taiyuan 030006, China
  • Received: 2008-07-09 Revised: 2008-11-03 Online: 2009-01-21 Published: 2009-01-21
  • Contact: BAI Miao-qing

Abstract: Collocations are an important knowledge source for the automatic syntactic parsing of Chinese, and verbs are the core of, and a prerequisite for, such parsing. By analyzing annotated real-world text, this paper constructs feature templates that capture the contextual information around verb collocation pairs, and uses the maximum entropy method to extract verb-verb collocations. The method was applied to 1 000 Chinese test sentences to identify collocations automatically; in the closed test it achieved an extraction accuracy of 85.6% and a recall of 70.6%.
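The abstract does not give the feature templates or the training procedure, so the following is only a minimal sketch of the general idea, under stated assumptions. It treats each candidate verb pair as a binary decision (collocation or not); for two classes, a maximum entropy model is equivalent to logistic regression over indicator features. The feature templates in `extract_features`, the PKU-style part-of-speech tags, and the four toy sentences are all hypothetical and invented for illustration; they are not the paper's templates or data. Evaluating on the training pairs themselves mirrors a closed test.

```python
import math
from collections import defaultdict


def extract_features(tokens, i, j):
    """Hypothetical context templates for a candidate verb pair at positions i < j.

    tokens is a list of (word, pos_tag) tuples; these templates are assumptions.
    """
    def pos_at(k):
        return tokens[k][1] if 0 <= k < len(tokens) else "NONE"

    return [
        "bias",
        "v1=" + tokens[i][0],             # first verb
        "v2=" + tokens[j][0],             # second verb
        "dist=" + str(min(j - i, 4)),     # token distance, capped at 4
        "left_pos=" + pos_at(i - 1),      # tag before the first verb
        "right_pos=" + pos_at(j + 1),     # tag after the second verb
    ]


def predict_proba(weights, feats):
    """P(collocation | features) under a binary maximum entropy model."""
    score = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-score))


def train(samples, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in samples:
            error = label - predict_proba(weights, feats)
            for f in feats:
                weights[f] += lr * error
    return weights


def precision_recall(predicted, gold):
    """Precision and recall of the positive (collocation) class."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented toy data: (tagged sentence, index of verb 1, index of verb 2, label).
TOY_DATA = [
    ([("他", "r"), ("开始", "v"), ("学习", "v"), ("汉语", "n")], 1, 2, 1),
    ([("我们", "r"), ("继续", "v"), ("讨论", "v"), ("问题", "n")], 1, 2, 1),
    ([("他", "r"), ("说", "v"), ("，", "w"), ("她", "r"), ("走", "v"), ("了", "u")], 1, 4, 0),
    ([("我", "r"), ("看", "v"), ("书", "n"), ("，", "w"), ("他", "r"), ("睡觉", "v")], 1, 5, 0),
]

samples = [(extract_features(t, i, j), y) for t, i, j, y in TOY_DATA]
weights = train(samples)

# Closed test: score the same pairs the model was trained on.
gold = [y for _, y in samples]
predictions = [1 if predict_proba(weights, f) > 0.5 else 0 for f, _ in samples]
precision, recall = precision_recall(predictions, gold)
```

On real data the candidate pairs, templates, and any smoothing or feature selection would have to follow the paper itself; the toy set here is trivially separable and says nothing about the reported 85.6% and 70.6% figures.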