Computer Engineering and Applications, 2009, Vol. 45, Issue 3: 135-137. DOI: 10.3778/j.issn.1002-8331.2009.03.040

• Database, Signal and Information Processing •

Autolabeling of Chinese verb-verb collocation based on maximum entropy principle

BAI Miao-qing1, ZHENG Jia-heng2

  1. Computer Center, Shanxi University, Taiyuan 030006, China
  2. Department of Computer, Shanxi University, Taiyuan 030006, China
  • Received: 2008-07-09 Revised: 2008-11-03 Online: 2009-01-21 Published: 2009-01-21
  • Contact: BAI Miao-qing

Automatic labeling of verb collocations based on the maximum entropy method

BAI Miao-qing (白妙青)1, ZHENG Jia-heng (郑家恒)2

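  1. Computer Center, Shanxi University, Taiyuan 030006
  2. Department of Computer, Shanxi University, Taiyuan 030006
  • Corresponding author: BAI Miao-qing (白妙青)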

Abstract: Collocation plays an important role in parsing, and the verb is the core of, and a precondition for, Chinese parsing. This paper presents a method for extracting verb-verb collocations based on the maximum entropy principle. Feature templates that capture the contextual information of verb collocation pairs are constructed by analyzing labeled real text. Applied to 1,000 test sentences in a closed test, the method achieves 85.6% accuracy and 70.6% recall.

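Abstract (translated from the Chinese): Collocations are an important knowledge source for automatic syntactic parsing of Chinese, and verbs are the core of, and a prerequisite for, syntactic analysis. By analyzing annotated real text, feature templates over the contextual variable information of verb collocation pairs are constructed, and a method is given for extracting verb-verb collocations with the maximum entropy method. The method is applied to automatically identify collocations in 1,000 Chinese test sentences; in the closed test, extraction accuracy is 85.6% and recall reaches 70.6%.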
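
To make the approach in the abstract concrete, the following is a minimal sketch of how a maximum entropy model scores a candidate verb pair from context features and how the reported measures could be computed. It is not the authors' implementation. The feature templates, the function names (`extract_features`, `maxent_prob`, `precision_recall`), the label names, and the weights are all hypothetical, since the abstract gives none of them. The sketch also assumes that the measure reported alongside recall is precision.

```python
import math


def extract_features(tokens, i, j):
    """Return binary context features for the candidate pair (tokens[i], tokens[j]).

    These templates are illustrative assumptions; the paper's actual
    templates are not listed in the abstract.
    """
    feats = {
        f"v1={tokens[i]}",
        f"v2={tokens[j]}",
        f"distance={j - i}",
    }
    if i > 0:
        feats.add(f"left={tokens[i - 1]}")
    if j + 1 < len(tokens):
        feats.add(f"right={tokens[j + 1]}")
    return feats


def maxent_prob(feats, weights, labels=("collocation", "other")):
    """Standard maximum entropy form: p(y | x) = exp(sum_i w_i * f_i(x, y)) / Z(x).

    `weights` maps (feature, label) pairs to real values; missing pairs count as 0.
    """
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}


def precision_recall(predicted, gold):
    """Precision and recall over sets of predicted and gold collocation pairs."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

In practice the weights would be estimated from the labeled corpus (for example by iterative scaling or gradient-based optimization) rather than set by hand, and a pair would be labeled a collocation when its probability exceeds that of the alternative label.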