Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (20): 8-13. DOI: 10.3778/j.issn.1002-8331.1707-0132

Word alignment-based Chinese deep semantic parsing

ZHENG Xiaodong1, HU Hanhui1, ZHAO Lindu1, LV Yongtao2   

  1. School of Economics and Management, Southeast University, Nanjing 211189, China
  2. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
  • Online: 2017-10-15 Published: 2017-10-31

A word alignment-based model for Chinese deep semantic parsing

郑晓东1,胡汉辉1,赵林度1,吕永涛2   

  1. 东南大学 经济管理学院,南京 211189
  2. 东南大学 计算机科学与工程学院,南京 211189

Abstract: Semantic parsing is the task of transforming natural-language sentences into complete, formal, symbolic Meaning Representations (MR) suitable for reasoning and machine understanding. In recent years, research on semantic parsing for English has made great progress, but little work has been done on Chinese semantic parsing. Because of inherent differences between Chinese and English, methods that work for English cannot simply be applied to Chinese. This paper proposes WACSP, a statistical approach to Chinese semantic parsing that treats the conversion of a Chinese sentence into its meaning representation as a machine translation process. First, the widely used GEOQUERY dataset is converted into a Chinese dataset in which each example pairs a Chinese sentence with its correct meaning representation. Next, a word alignment model is used to acquire a bilingual dictionary that maps Chinese natural-language strings to their meaning representations. Finally, a statistical model is learned to determine the final semantic parse. Experimental results show that WACSP achieves relatively high precision and coverage.

Key words: natural language processing, semantic parsing, word alignment model

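Abstract (Chinese version): Semantic parsing refers to converting natural-language sentences into meaning representations that are convenient for machine understanding and reasoning. In recent years, research on English semantic parsing has made great progress, whereas work on Chinese semantic parsing remains comparatively scarce. Chinese and English differ in certain respects, so semantic parsing methods that suit English do not necessarily suit Chinese. Therefore, targeting the linguistic characteristics of Chinese, this paper proposes a word alignment-based Chinese semantic parsing method that treats the conversion of a Chinese sentence into its corresponding meaning representation as a machine translation process. First, GEOQUERY, a training dataset commonly used by English semantic parsing methods, is converted into a Chinese dataset in which each training example consists of a Chinese sentence and its correct meaning representation. Next, a word alignment model is used to obtain a bilingual dictionary composed of Chinese natural-language strings and their corresponding meaning representations. Finally, the final semantic parsing model is determined by learning a probability estimation model. Experimental results show that WACSP achieves relatively high precision and coverage.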

Key words (Chinese version): natural language processing, semantic parsing, word alignment model
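
The word-alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which alignment model WACSP uses, so the code below substitutes IBM Model 1 trained with EM as a stand-in (without a NULL word). The three sentence/meaning pairs, the MR token names such as `stateid_texas`, and the helper names `train_ibm1` and `best_mr_token` are all hypothetical and are not taken from the paper or from the GEOQUERY data.

```python
from collections import defaultdict

# Toy, invented examples: a segmented Chinese sentence paired with a flat list of
# MR tokens. The real GEOQUERY meaning representations are structured; this is
# only an illustration of the alignment idea.
pairs = [
    (["德克萨斯", "的", "首府"], ["capital", "stateid_texas"]),
    (["德克萨斯", "的", "人口"], ["population", "stateid_texas"]),
    (["俄亥俄", "的", "首府"], ["capital", "stateid_ohio"]),
]

def train_ibm1(pairs, iterations=10):
    """Estimate t[(m, w)] = P(MR token m | Chinese word w) with IBM Model 1 EM."""
    mr_vocab = {m for _, mr in pairs for m in mr}
    # Start from a uniform distribution over MR tokens for every word.
    t = defaultdict(lambda: 1.0 / len(mr_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (m, w) alignments
        total = defaultdict(float)  # expected count of w being aligned at all
        for words, mr in pairs:
            for m in mr:
                # E-step: share one unit of credit for m among the sentence's words.
                norm = sum(t[(m, w)] for w in words)
                for w in words:
                    frac = t[(m, w)] / norm
                    count[(m, w)] += frac
                    total[w] += frac
        # M-step: renormalise the expected counts per Chinese word.
        t = defaultdict(float, {(m, w): c / total[w] for (m, w), c in count.items()})
    return t

def best_mr_token(t, word):
    """Return the MR token with the highest translation probability for a word."""
    return max((p, m) for (m, w), p in t.items() if w == word)[1]

t = train_ibm1(pairs)
for word in ["首府", "德克萨斯"]:
    print(word, "->", best_mr_token(t, word))
```

Reading off the most probable MR token for each word gives a crude bilingual dictionary of the kind the abstract mentions. Choosing among competing parses would still require the separately learned statistical model, which this sketch does not cover.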