Computer Engineering and Applications (计算机工程与应用), 2015, Vol. 51, Issue 19: 129-137.

• Databases, Data Mining, Machine Learning •

“把”字句的自动释义与句式变换研究

王璐璐1,孙薇薇2,袁毓林2   

  1. 中国传媒大学,北京 100024
  2. 北京大学,北京 100871
  • 出版日期:2015-09-30 发布日期:2015-10-13

On automatic interpretation and pattern alternation of the Chinese bǎ-construction

WANG Lulu1, SUN Weiwei2, YUAN Yulin2   

  1. Communication University of China, Beijing 100024, China
  2. Peking University, Beijing 100871, China
  • Online: 2015-09-30 Published: 2015-10-13

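摘要 (Abstract): To address the difficulties that the “把” (bǎ) construction poses for machine translation, this paper explores a method for the automatic interpretation and pattern alternation of bǎ-sentences that combines rules with statistics. The computation proceeds in four steps: (1) bǎ-sentences are divided into subclasses according to their alternation relations with other sentence patterns, and the syntactic and semantic features of each subclass are summarized, yielding a language model of the bǎ-construction; (2) bǎ-sentences are selected from the Peking University Chinese Treebank as the corpus and annotated with the syntactic and semantic features of each subclass, producing annotated text rich in syntactic and semantic information; (3) on this basis, bǎ-sentences are automatically identified using both chunking and full syntactic parsing; (4) a discriminative machine learning method is then used to classify the bǎ-sentences automatically. Building on the identification and classification results, an automatic interpretation and pattern alternation program for bǎ-sentences is obtained by applying interpretation templates and alternation templates.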

关键词 (Keywords): “把”字句 (bǎ-construction), transformational analysis, frame identification, automatic classification, automatic interpretation

Abstract: To improve the accuracy of automatic interpretation of the bǎ-construction, this paper adopts a computationally oriented, cognitively based scheme and attempts an automatic analysis of the semantic interpretations and syntactic alternations of the bǎ-construction. The paper first classifies the bǎ-construction into subtypes according to its alternations with other constructions and summarizes the syntactic and semantic features of each subtype. It then builds a language model of the bǎ-construction, together with a text annotated with those syntactic and semantic features. Next, it automatically identifies instances of the bǎ-construction using chunking and parsing methods and automatically classifies them using machine learning. Finally, the paper designs an automatic interpretation and alternation system for the bǎ-construction.

Keywords: bǎ-construction, pattern alternation, frame identification, automatic classification, automatic interpretation
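
The final stage described in the abstract, applying alternation templates to an identified bǎ-frame, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `BaFrame` structure, the function names, the fixed-length object rule, and the two templates (a 被 passive and a plain subject-verb-object order) are hypothetical and are not taken from the paper. The paper itself identifies frames with chunking and full parsing and selects the subtype with a discriminative classifier; here a hand-written rule over a pre-segmented sentence stands in for both.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BaFrame:
    """Hypothetical frame: the three spans around the marker 把."""
    subject: List[str]
    obj: List[str]
    predicate: List[str]


def identify_ba_frame(tokens: List[str], object_len: int = 1) -> Optional[BaFrame]:
    """Split a pre-segmented sentence around the first 把.

    Toy assumption: the object is exactly `object_len` tokens long.
    A real system would take the object span from a chunker or parser.
    """
    if "把" not in tokens:
        return None
    i = tokens.index("把")
    subject = tokens[:i]
    obj = tokens[i + 1 : i + 1 + object_len]
    predicate = tokens[i + 1 + object_len :]
    if not subject or not obj or not predicate:
        return None  # incomplete frame, nothing to transform
    return BaFrame(subject, obj, predicate)


def to_bei(frame: BaFrame) -> List[str]:
    """Illustrative template: object + 被 + subject + predicate."""
    return frame.obj + ["被"] + frame.subject + frame.predicate


def to_svo(frame: BaFrame) -> List[str]:
    """Illustrative template: subject + predicate + object."""
    return frame.subject + frame.predicate + frame.obj


if __name__ == "__main__":
    frame = identify_ba_frame(["他", "把", "杯子", "打碎", "了"])
    if frame is not None:
        print("".join(to_bei(frame)))  # 杯子被他打碎了
        print("".join(to_svo(frame)))  # 他打碎了杯子
```

Not every bǎ-sentence admits both alternations, which is why the approach described above first assigns each sentence to a subtype and only then chooses the applicable templates; this sketch skips that step and applies both templates unconditionally.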