Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (35): 150-154.

• Database, Signal and Information Processing •

Research on aspects of statistical machine translation between Chinese and Uyghur

XU Chun¹, YANG Yong², DONG Xinghua³

  1. College of Computer Science and Engineering, Xinjiang University of Finance & Economics, Urumqi 830012, China
  2. College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
  3. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  • Online: 2011-12-11 Published: 2011-12-11

Abstract: Chinese and Uyghur differ greatly in morphology. Using the open-source Moses toolkit, this paper compares several translation models and analyzes the corresponding experimental results in order to examine the factors that affect Chinese-Uyghur and Uyghur-Chinese statistical machine translation. These factors include word alignment; consistency (agreement) of subject, predicate head word, and tense in Chinese-to-Uyghur translation; out-of-vocabulary (OOV) words in Uyghur-to-Chinese translation; and differences in syntactic structure between the two languages. Finally, some suggestions are given for improving the performance of statistical translation in both directions.

Key words: Chinese-Uyghur, Uyghur-Chinese, word alignment, consistency, syntactic structure