Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (6): 9-13.

• Doctoral Forum •

Automatic Acquisition of Chinese-English Verb Subcategorization Correspondences

HAN Xi-wu (韩习武)

  1. Institute of Computational Linguistics, Heilongjiang University, Harbin 150080
  • Received: 2007-10-10 Revised: 2007-11-27 Online: 2008-02-21 Published: 2008-02-21
  • Corresponding author: HAN Xi-wu

Acquisition for Chinese English subcategorization relations

HAN Xi-wu   

  1. Institute of Computational Linguistics, Heilongjiang University, Harbin 150080, China
  • Received: 2007-10-10 Revised: 2007-11-27 Online: 2008-02-21 Published: 2008-02-21
  • Contact: HAN Xi-wu

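Abstract: Research on verb subcategorization and its automatic acquisition has produced good results for many languages, including English and Chinese, but cross-lingual subcategorization research remains scarce and unsystematic. This paper describes a systematic experiment that uses statistical analysis of a Chinese-English bilingual corpus to acquire cross-lingual subcategorization correspondences. First, sentence pairs whose predicates are likely to be aligned are identified using a bilingual dictionary and syntactic similarity. Then, a statistical filtering method based on a two-fold maximum likelihood test is applied, automatically acquiring 654 types of subcategorization frame correspondences. Analysis of the experimental results shows that these correspondence types are statistically and syntactically compatible.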

Key words: verb subcategorization, cross-lingual correspondences, automatic acquisition

Abstract: Research on verb subcategorization and its acquisition has achieved a great deal for single languages, such as English and Chinese, whereas cross-lingual subcategorization demands more systematic effort. This paper describes a systematic experiment in the statistical analysis and acquisition of bilingual subcategorization relations based on a Chinese-English parallel corpus. First, sentence pairs with possibly parallel predicates are extracted. Then, 654 basic bilingual types of subcategorization frames are acquired by means of the two-fold MLE filtering method. Analysis of the results shows that the acquired bilingual subcategorization frames are statistically and syntactically compatible.

Key words: verb subcategorization, cross-lingual relations, acquisition
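
The abstract names a two-fold MLE filtering step but does not define it. The Python sketch below is one possible reading, not the paper's method: it assumes "two-fold" means that a (Chinese frame, English frame) pair must pass a relative-frequency (maximum likelihood estimate) threshold in both conditional directions. The function name `mle_filter`, the threshold of 0.05, the frame labels, and all counts are hypothetical and chosen only for illustration.

```python
from collections import Counter


def mle_filter(pair_counts, threshold=0.05):
    """Keep frame pairs whose relative frequency passes the threshold
    in both directions (an assumed reading of "two-fold")."""
    zh_totals = Counter()
    en_totals = Counter()
    for (zh, en), n in pair_counts.items():
        zh_totals[zh] += n
        en_totals[en] += n

    kept = {}
    for (zh, en), n in pair_counts.items():
        p_en_given_zh = n / zh_totals[zh]  # MLE of P(en frame | zh frame)
        p_zh_given_en = n / en_totals[en]  # MLE of P(zh frame | en frame)
        if p_en_given_zh >= threshold and p_zh_given_en >= threshold:
            kept[(zh, en)] = (p_en_given_zh, p_zh_given_en)
    return kept


# Hypothetical co-occurrence counts of aligned frame pairs (invented data).
counts = Counter({
    ("NP V NP", "NP V NP"): 90,
    ("NP V NP", "NP V PP"): 8,
    ("NP V NP", "NP V"): 2,
    ("NP V", "NP V"): 40,
    ("NP V", "NP V NP"): 1,
})

kept = mle_filter(counts)
for pair, probs in sorted(kept.items()):
    print(pair, probs)
```

With these invented counts, the two rare pairings fall below the threshold in the Chinese-to-English direction and are discarded, while the three frequent pairings survive. A real system would first need the predicate-aligned sentence pairs and parsed frames described in the abstract, which this sketch takes as given.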