计算机工程与应用 ›› 2010, Vol. 46 ›› Issue (16): 135-139. DOI: 10.3778/j.issn.1002-8331.2010.16.040

• 数据库、信号与信息处理 •

双语词典在统计机器翻译中的应用

王 丽,韩习武   

  1. 黑龙江大学 信息技术研究所,哈尔滨 150082
  • 收稿日期:2008-11-21 修回日期:2009-02-02 出版日期:2010-06-01 发布日期:2010-06-01
  • 通讯作者: 王 丽

Application of bilingual dictionary in statistical machine translation

WANG Li, HAN Xi-wu

  1. Institute of Information Technology, Heilongjiang University, Harbin 150082, China
  • Received:2008-11-21 Revised:2009-02-02 Online:2010-06-01 Published:2010-06-01
  • Contact: WANG Li

摘要: 在当前的基于统计的翻译方法中,双语语料库的规模、词对齐的准确率对于翻译系统的性能有很大的影响。虽然大规模语料库可以改善词语对齐的准确度,提高系统的性能,但同时会以增加系统的负载为代价,因此目前对于统计机器翻译方法的研究在使用大规模语料库的基础上,同时寻求其他可以提高系统性能的方法。针对以上问题,提出一种把双语词典应用在统计机器翻译中的方法,不仅优化了词对齐的准确率,而且得出质量更高的翻译结果,在一定程度上缓解了数据稀疏问题。

关键词: 统计机器翻译, 双语词典, 双语语料库

Abstract:

In current statistical machine translation (SMT), the size of the bilingual corpus and the accuracy of word alignment strongly affect system performance. Although a large bilingual corpus can improve word-alignment accuracy and overall performance, it does so at the cost of a heavier system load. Current SMT research therefore builds on large bilingual corpora while also seeking other ways to improve performance. This paper proposes an approach that integrates a bilingual dictionary into the SMT system. The approach improves word-alignment accuracy, yields higher-quality translations, and eases the data sparsity problem to a certain extent.

Key words: statistical machine translation, bilingual dictionary, bilingual corpus
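
The abstract does not say how the dictionary is integrated, so the following is only an illustrative sketch of one common technique, not the authors' method: dictionary entries are appended to the parallel corpus as extra one-word sentence pairs before word-translation probabilities are estimated with IBM Model 1. The function names (`train_ibm1`, `align`), the romanized toy tokens, and the single dictionary entry are all assumptions made for this example.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate t(f|e) with IBM Model 1 EM (NULL word omitted for brevity)."""
    src_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(src_vocab)
    t = defaultdict(lambda: uniform)      # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)        # expected count of each (f, e) link
        total = defaultdict(float)        # expected count of each e
        for fs, es in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / norm
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise expected counts into probabilities
        t = defaultdict(float, {fe: c / total[fe[1]] for fe, c in count.items()})
    return t


def align(fs, es, t):
    """Link each source word to its most probable target word."""
    return [(f, max(es, key=lambda e: t[(f, e)])) for f in fs]


# Hypothetical toy data: one sentence pair in which every word occurs once.
corpus = [(["tongji", "fanyi"], ["statistical", "translation"])]
# Hypothetical dictionary entry, treated as an extra one-word sentence pair.
dictionary = [(["fanyi"], ["translation"])]

t_base = train_ibm1(corpus)               # corpus only
t_dict = train_ibm1(corpus + dictionary)  # corpus plus dictionary
alignment = align(corpus[0][0], corpus[0][1], t_dict)
```

With the sentence pair alone, every word occurs exactly once, so the two candidate links for each source word stay tied. This is a small instance of the data sparsity problem. Adding the dictionary entry breaks the tie for `fanyi`, and EM then shifts `tongji` toward `statistical` as well.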
