Computer Engineering and Applications, 2019, Vol. 55, Issue (24): 159-163. DOI: 10.3778/j.issn.1002-8331.1808-0385

Transition-Based Kazakh Parsing with Neural Network

BAI Yawen, Gulia Altenbek   

  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
  2. Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University, Urumqi 830046, China
  3. The Base of Kazakh and Kirghiz Language of National Language Resource Monitoring and Research Center on Minority Languages, Xinjiang University, Urumqi 830046, China
  • Online: 2019-12-15  Published: 2019-12-11

基于转移的神经网络哈萨克语句法分析

白雅雯,古丽拉·阿东别克   

  1. 新疆大学 信息科学与工程学院,乌鲁木齐 830046
  2. 新疆大学 新疆多语种信息技术实验室,乌鲁木齐 830046
  3. 新疆大学 国家语言资源监测与研究少数民族语言中心哈萨克语和柯尔克孜语文基地,乌鲁木齐 830046

Abstract: To further improve the accuracy of Kazakh syntactic parsing and lay a solid foundation for Kazakh natural language processing, this paper studies transition-based Kazakh parsing. An improved transition-based method is used to process syntax trees: each tree is converted into an action sequence by in-order traversal. The parser is built as a neural network in which three long short-term memory (LSTM) networks represent the stack, the buffer, and the action history, respectively, and the model is trained on these representations. The action sequence is predicted from the resulting probabilities, which yields the parse. The improved transition-based method achieves a Kazakh parsing accuracy of 74.37%.

Key words: parsing, transition-based method, long short-term memory (LSTM)

摘要: 为了进一步提高哈萨克语句法分析的准确率,为哈萨克语自然语言处理奠定良好基础,对基于转移的哈萨克语句法分析进行研究,采用改进后的基于转移的方法对句法树进行处理,即中序遍历句法树的方法将句法树转换为动作序列。使用神经网络构建句法分析器框架,分别使用三个长短期记忆网络(LSTM)表示堆栈信息、缓冲区信息以及动作历史信息对模型进行训练,根据所得到的概率预测动作序列,从而得到句法分析的结果。改进后的转移方法得到的句法分析准确率为74.37%。

关键词: 句法分析, 转移方法, 长短期记忆网络(LSTM)
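
The conversion of a syntax tree into an action sequence by in-order traversal, as described in the abstract, can be illustrated with a minimal sketch. The abstract does not list the parser's action inventory, so the action names used here (SHIFT, PJ-X, REDUCE, FINISH) are an assumption borrowed from the common in-order transition convention, and the `Node` class, the function names, and the placeholder tokens `w1`, `w2`, `w3` are hypothetical. The second function replays the actions over a stack, a buffer, and an action history, the three pieces of state that the abstract says are each encoded by an LSTM; no neural network is involved in this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Constituency tree node; a leaf stores a word in `label` (hypothetical type)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def inorder_actions(node: Node) -> List[str]:
    """In-order oracle: first child, then the node's own label, then the rest."""
    if not node.children:
        return ["SHIFT"]                      # a word moves from buffer to stack
    first, *rest = node.children
    actions = inorder_actions(first)          # visit the leftmost child first
    actions.append(f"PJ-{node.label}")        # then project the parent label
    for child in rest:
        actions.extend(inorder_actions(child))
    actions.append("REDUCE")                  # close the constituent
    return actions


def oracle(root: Node) -> List[str]:
    """Full action sequence for one tree (assumed to end with FINISH)."""
    return inorder_actions(root) + ["FINISH"]


def apply_actions(words: List[str], actions: List[str]) -> str:
    """Replay actions on (stack, buffer, history) and return a bracketed tree."""
    stack: list = []
    buffer = list(words)
    history: List[str] = []                   # the parser's action history
    for action in actions:
        history.append(action)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action.startswith("PJ-"):
            # Marker for an open constituent; its first child lies just below.
            stack.append(("OPEN", action[3:]))
        elif action == "REDUCE":
            later = []
            while not isinstance(stack[-1], tuple):
                later.append(stack.pop())
            _, label = stack.pop()
            first = stack.pop()
            later.reverse()
            stack.append("(" + " ".join([label, first] + later) + ")")
        elif action == "FINISH":
            break
    return stack[0]


# Toy tree (S (NP w1) (VP w2 w3)) over placeholder tokens, not real Kazakh data.
tree = Node("S", [
    Node("NP", [Node("w1")]),
    Node("VP", [Node("w2"), Node("w3")]),
])
actions = oracle(tree)
rebuilt = apply_actions(["w1", "w2", "w3"], actions)
print(actions)
print(rebuilt)
```

In a trained parser, each step would pick the action with the highest predicted probability instead of reading it from an oracle; the oracle sequence is what such a model would be trained to reproduce.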