Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (28): 17-20.

• Doctoral Forum •

树库中双词搭配的自动获取和识别研究

徐润华1,冯敏萱2,陈小荷3   

  1. 南京师范大学 文学院,南京 210097
  • 收稿日期:1900-01-01 修回日期:1900-01-01 出版日期:2011-10-01 发布日期:2011-10-01

Automatic acquisition and recognition of two-word collocation in treebank

XU Runhua1,FENG Minxuan2,CHEN Xiaohe3   

  1. College of Liberal Arts,Nanjing Normal University,Nanjing 210097,China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-10-01 Published:2011-10-01

摘要: 大规模语料中的搭配自动获取和识别技术是自然语言处理领域的基础性工作之一。句子中的搭配和句法结构密切相关,从句法对搭配进行约束的角度,分别提出了一种保留结构中心词的搭配获取方法和一种添加了句法规则约束的搭配识别方法。实验结果表明,保留结构中心词的搭配获取方法能够较为有效地从树库中抽取搭配;添加了句法规则约束的搭配识别方法较之简单查表的搭配识别方法有10%~15%的效果提升。

关键词: 搭配获取, 搭配识别, 句法结构, 规则约束

Abstract: Automatic acquisition and recognition of collocations in large-scale corpora is one of the fundamental tasks in natural language processing. Collocations in a sentence are closely tied to its syntactic structure. Starting from the constraints that syntax places on collocations, this paper proposes a collocation acquisition method that retains the head words of syntactic structures and a collocation recognition method that adds syntactic rule constraints. Experimental results show that the head-word-retaining acquisition method extracts collocations from a treebank fairly effectively, and that the rule-constrained recognition method improves on recognition by simple table lookup by 10%~15%.

Key words: collocation acquisition, collocation recognition, syntactic structure, rule constraint
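
The contrast the abstract draws between recognition by simple table lookup and recognition with an added syntactic constraint can be illustrated with a minimal sketch. This is not the paper's implementation: the `Token` layout (a head index and a relation label per word), the English example words, the `COLLOCATIONS` table, and the single rule in `ALLOWED_RELATIONS` are all assumptions made for illustration only.

```python
from itertools import combinations
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    head: int      # index of the syntactic head in the sentence, -1 for the root
    relation: str  # label of the link to the head (hypothetical label set)


# Hypothetical collocation table, e.g. the output of an acquisition step.
COLLOCATIONS = {("solve", "problem"), ("raise", "question")}

# Hypothetical rule: only these relations may link the two words of a collocation.
ALLOWED_RELATIONS = {"obj"}


def lookup_recognize(sentence):
    """Baseline: report every ordered word pair of the sentence found in the table."""
    found = []
    for i, j in combinations(range(len(sentence)), 2):
        if (sentence[i].word, sentence[j].word) in COLLOCATIONS:
            found.append((i, j))
    return found


def rule_recognize(sentence):
    """Table lookup plus a syntactic check: the pair must be directly linked
    (one word is the head of the other) by an allowed relation."""
    found = []
    for i, j in lookup_recognize(sentence):
        a, b = sentence[i], sentence[j]
        linked = (b.head == i and b.relation in ALLOWED_RELATIONS) or (
            a.head == j and a.relation in ALLOWED_RELATIONS
        )
        if linked:
            found.append((i, j))
    return found


# "They solve the problem": "problem" is the object of "solve".
SENT_A = [
    Token("They", 1, "nsubj"),
    Token("solve", -1, "root"),
    Token("the", 3, "det"),
    Token("problem", 1, "obj"),
]

# "They solve puzzles and discuss the problem": "problem" depends on "discuss".
SENT_B = [
    Token("They", 1, "nsubj"),
    Token("solve", -1, "root"),
    Token("puzzles", 1, "obj"),
    Token("and", 4, "cc"),
    Token("discuss", 1, "conj"),
    Token("the", 6, "det"),
    Token("problem", 4, "obj"),
]

print(lookup_recognize(SENT_A), rule_recognize(SENT_A))
print(lookup_recognize(SENT_B), rule_recognize(SENT_B))
```

In the second sentence the table lookup alone reports "solve ... problem" even though the two words are not syntactically related, while the constrained version rejects it. This is the kind of false positive a syntactic rule is meant to filter out; the rules actually used in the paper are not given in the abstract.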