Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (35): 147-149.
• Database, Signal and Information Processing •
TIAN Shengwei1,YU Long2,YANG Feiyu3
田生伟1,禹 龙2,杨飞宇3
Abstract: This paper proposes an improved adaptive algorithm for Chinese-Uyghur sentence alignment. Traditional alignment methods do not adapt well to changes in corpus type. The algorithm uses the byte-length ratio of the Chinese-Uyghur text currently being aligned, together with historical matching-pattern data, to modify the alignment model's parameters dynamically so that they track changes in corpus type, which improves sentence alignment performance. Compared with the length-based alignment model, the algorithm improves alignment accuracy by 3.5 percentage points and recall by 2.7 percentage points; compared with the mixed alignment model, it improves them by 1.9 and 1.8 percentage points, respectively. Experimental results show that the algorithm adapts well to changes in corpus type.
Key words: bilingual corpora, sentence alignment, adaptive
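Abstract (translated from the Chinese): An improved adaptive Chinese-Uyghur sentence alignment algorithm is proposed for aligning Chinese and Uyghur sentences. Because traditional alignment methods do not adapt well to changes in corpus type, the algorithm uses the byte-length ratio of the Chinese-Uyghur text currently awaiting alignment and historical matching-pattern data to revise the parameters of the alignment model dynamically, so that the model adapts to changes in corpus type. This improves the performance of Chinese-Uyghur sentence alignment: accuracy and recall rise by 3.5 and 2.7 percentage points, respectively, over the length-based alignment model, and by 1.9 and 1.8 percentage points over mixed alignment. The experimental results confirm that the algorithm adapts effectively to changes in corpus type.
Key words (translated from the Chinese): bilingual corpus, sentence alignment, adaptive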
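The abstract names two adaptive inputs: the byte-length ratio of the text currently being aligned and historical matching-pattern data. The abstract does not give the paper's formulas, so the following is only a minimal sketch of how such a scheme could look, built on a Gale-Church-style length-based dynamic program rather than on the authors' actual model. Everything named here is an assumption for illustration: the function names, UTF-8 as the byte encoding, the variance value 6.8, the smoothing constant, and the starting counts in `INITIAL_COUNTS`.

```python
import math
from collections import Counter

# Bead types as (number of source sentences, number of target sentences).
PATTERNS = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2), (2, 2)]

# Illustrative starting counts that favour 1-1 beads (assumed, not from the paper).
INITIAL_COUNTS = {(1, 1): 89, (1, 0): 1, (0, 1): 1, (2, 1): 4, (1, 2): 4, (2, 2): 1}


def byte_len(sentence):
    # Byte length under UTF-8; the encoding is an assumption.
    return len(sentence.encode("utf-8"))


def length_ratio(src_lens, tgt_lens):
    # Adaptive input 1: ratio estimated from the text currently being aligned.
    total_src, total_tgt = sum(src_lens), sum(tgt_lens)
    if total_src == 0 or total_tgt == 0:
        return 1.0
    return total_tgt / total_src


def pattern_priors(history, smoothing=0.5):
    # Adaptive input 2: bead-type priors re-estimated from past alignments.
    total = sum(history.get(p, 0) + smoothing for p in PATTERNS)
    return {p: (history.get(p, 0) + smoothing) / total for p in PATTERNS}


def bead_cost(src_len, tgt_len, ratio, variance, prior):
    # Negative log of (two-sided normal tail probability times pattern prior).
    mean = (src_len + tgt_len / ratio) / 2.0
    if mean == 0:
        delta = 0.0
    else:
        delta = (tgt_len - src_len * ratio) / math.sqrt(mean * variance)
    tail = max(math.erfc(abs(delta) / math.sqrt(2.0)), 1e-300)
    return -math.log(tail) - math.log(prior)


def align(src, tgt, history=None, variance=6.8):
    """Return beads as (source indices, target indices); updates history in place."""
    if history is None:
        history = Counter(INITIAL_COUNTS)
    src_lens = [byte_len(s) for s in src]
    tgt_lens = [byte_len(t) for t in tgt]
    ratio = length_ratio(src_lens, tgt_lens)
    priors = pattern_priors(history)
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in PATTERNS:
                if i + di > n or j + dj > m:
                    continue
                c = cost[i][j] + bead_cost(
                    sum(src_lens[i:i + di]), sum(tgt_lens[j:j + dj]),
                    ratio, variance, priors[(di, dj)])
                if c < cost[i + di][j + dj]:
                    cost[i + di][j + dj] = c
                    back[i + di][j + dj] = (di, dj)
    # Trace the cheapest path back from the end of both texts.
    beads = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        beads.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    beads.reverse()
    # Feed the observed bead types back into the history for the next text.
    for s_idx, t_idx in beads:
        key = (len(s_idx), len(t_idx))
        history[key] = history.get(key, 0) + 1
    return beads
```

Passing the same `history` object to successive `align` calls lets the bead-type priors drift toward whatever patterns the corpus actually exhibits, while `length_ratio` is recomputed per text; this is one plausible reading of "modifies the alignment model parameters dynamically", not a reconstruction of the published method.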
TIAN Shengwei, YU Long, YANG Feiyu. Improved adaptive algorithm for Chinese-Uyghur sentence alignment[J]. Computer Engineering and Applications, 2011, 47(35): 147-149.
田生伟1,禹 龙2,杨飞宇3. 改进的自适应汉维句子对齐[J]. 计算机工程与应用, 2011, 47(35): 147-149.
URL: http://cea.ceaj.org/EN/
http://cea.ceaj.org/EN/Y2011/V47/I35/147