Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (4): 115-119.


Chinese characters conversion system based on lookup table and statistical methods

PANG Zhenjun, YAO Tianfang   

  1. Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240, China
  • Online: 2015-02-15 Published: 2015-02-04

基于对照表以及语义相关性之简繁汉字转换

庞祯军,姚天昉   

  1. 上海交通大学 计算机科学与工程系,上海 200240

Abstract: Chinese is currently written in two forms: simplified characters, used in Mainland China and Singapore, and traditional characters, used in Hong Kong, Macao, Taiwan and some overseas Chinese communities. Most simplified characters have the same meaning and usage as a single traditional counterpart, and this one-to-one case can be converted correctly by looking up a simplified-traditional correspondence table. However, a considerable number of simplified characters correspond to several traditional characters, and this one-to-many case is the key difficulty of simplified-to-traditional conversion. Against this background, a conversion method based on a lookup table and statistical semantic relatedness is proposed. In the one-to-many simplified-to-traditional conversion evaluation jointly organized by the Department of Language Information Management of the Ministry of Education and the Chinese Information Processing Society of China, the system ranked first with an accuracy of 95.6%.

Key words: simplified Chinese characters, traditional Chinese characters, conversion between simplified and traditional Chinese characters, one-to-many simplified-to-traditional conversion

摘要: 目前使用的汉字有简体和繁体两大形式:中国大陆和新加坡等地使用简体字,我国港澳台地区和部分海外华人社区使用繁体字。其中大多数简体字的意义和用法与对应的繁体字是一样的,具有一一对应关系,这种情况通过查找简繁对照表就可以正确处理。然而,还有相当一部分简体字对应多个繁体字,这是简繁字转换的重点和难点。基于此背景提出基于对照表以及语义相关性的简繁汉字转换方法。在教育部语信司及中国中文信息学会联合举办的一对多简繁转换评测中,此一对多简繁转换系统以95.6%的准确率排名第一。

关键词: 简体字, 繁体字, 简繁体字转换, 一对多简繁转换
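
The two cases described in the abstract (one-to-one characters resolved by table lookup, one-to-many characters resolved from context) can be illustrated with a minimal Python sketch. This is not the authors' system: the tables `ONE_TO_ONE` and `ONE_TO_MANY`, the counts in `COOCCURRENCE`, the window size, and the function names are all hypothetical, and a simple neighbouring-character co-occurrence score stands in for whatever semantic relatedness measure the paper actually uses.

```python
# Toy one-to-one table (assumption: a real system loads a full correspondence table).
ONE_TO_ONE = {"汉": "漢", "简": "簡", "体": "體", "头": "頭", "现": "現", "条": "條"}

# One-to-many characters; the first candidate is the assumed default form.
ONE_TO_MANY = {"发": ["發", "髮"], "面": ["面", "麵"]}

# Hypothetical counts: (traditional candidate, neighbouring simplified character).
# In practice these would be estimated from a large corpus.
COOCCURRENCE = {
    ("髮", "头"): 50, ("髮", "理"): 30,
    ("發", "现"): 80, ("發", "展"): 90,
    ("麵", "条"): 40, ("麵", "吃"): 20,
    ("面", "前"): 60, ("面", "对"): 70,
}


def choose_candidate(text, index, candidates, window=2):
    """Pick the candidate with the highest co-occurrence score in the context."""
    left = text[max(0, index - window):index]
    right = text[index + 1:index + 1 + window]
    context = left + right
    scores = [
        sum(COOCCURRENCE.get((cand, ch), 0) for ch in context)
        for cand in candidates
    ]
    # max() returns the first maximum, so ties fall back to the default form.
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


def to_traditional(text, window=2):
    """Convert simplified text: context scoring for one-to-many, lookup otherwise."""
    out = []
    for i, ch in enumerate(text):
        if ch in ONE_TO_MANY:
            out.append(choose_candidate(text, i, ONE_TO_MANY[ch], window))
        else:
            # Characters missing from the table are passed through unchanged.
            out.append(ONE_TO_ONE.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    for word in ["头发", "发现", "面条", "简体"]:
        print(word, "->", to_traditional(word))
```

With these toy tables, 发 becomes 髮 next to 头 but 發 next to 现, while characters such as 简 and 体 never reach the scoring step. When no context character carries any evidence, the sketch falls back to the first listed candidate.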