Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (29): 151-156.

• Database, Signal and Information Processing •

Research on automatic proofreading of modern Tibetan syllables

GUAN Bai1, CAI Kezhaxi2   

  1. Department of Computer Science and Technology, Tibet University, Lhasa 850000, China
  2. School of Mathematics and Computer Science, Northwest University for Nationalities, Lanzhou 730030, China
  • Online: 2012-10-11 Published: 2012-10-22

Abstract: Syllable proofreading is the foundation of automatic proofreading for modern Tibetan. The two-dimensional writing format and distinctive grammar of modern Tibetan, together with the agglutination of case particles, the rules governing syllable collocation, and the presence of both real-word and non-word errors in syllables, make research on Tibetan automatic proofreading differ from that on English and Chinese. Based on the characteristics of syllables in modern Tibetan, this paper discusses automatic proofreading of modern Tibetan syllables in detail, using syllable preprocessing, syllable table matching, confusion set matching, bigram continuation relations, and the minimum edit distance method.

Key words: Tibetan automatic proofreading, syllable, real-word error, agglutinative case particle
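
Two of the techniques the abstract names, table matching and minimum edit distance, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the three-entry `SYLLABLE_TABLE`, and the `max_distance` threshold are all hypothetical, and the distance is computed over Unicode code points, which is an assumption made here, since the abstract does not say which unit the paper uses.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i code points of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def check_syllable(syllable, table, max_distance=1):
    """Return (is_in_table, suggestions) for one syllable.

    A syllable found in the table is accepted as is. Otherwise it is treated
    as a non-word and the table entries within max_distance edits are
    returned, closest first.
    """
    if syllable in table:
        return True, []
    scored = sorted((edit_distance(syllable, entry), entry) for entry in table)
    return False, [entry for dist, entry in scored if dist <= max_distance]


# Hypothetical toy table; a real syllable table would hold many thousands of entries.
SYLLABLE_TABLE = {"བོད", "ཡིག", "སྐད"}

if __name__ == "__main__":
    # The second string is simply a string absent from the toy table.
    for s in ["བོད", "བད"]:
        print(s, check_syllable(s, SYLLABLE_TABLE))
```

Table lookup of this kind can only flag non-word errors. A real-word error is a valid syllable used in the wrong place, so it passes the lookup, which is presumably why the abstract also lists confusion sets and bigram continuation relations.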