Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (8): 146-150.


Research about Tibetan-sort based on ISO/IEC 10646(Tibetan)

Wan-mezhaxi1, Ni-mazhaxi2   

  1. Department of Computer Science, College of Engineering, Tibet University, Lhasa 850000, China
  2. Modern Education Technology Center, Tibet University, Lhasa 850000, China
  • Online:2013-04-15 Published:2013-04-15

Research on sorting techniques for modern Tibetan in the small character set

完么扎西1,尼玛扎西2   

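  1. Department of Computer Science, College of Engineering, Tibet University, Lhasa 850000
  2. Modern Education Technology Center, Tibet University, Lhasa 850000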

Abstract: The component letters of a Tibetan syllable follow a fixed order, and ISO/IEC 10646 (Tibetan) assigns a sorting code to every Tibetan character. However, because Tibetan syllables are structurally complex, Tibetan text can neither be sorted by the written order of the letters in each syllable nor be sorted by applying these sorting codes directly. This paper proposes a Tibetan sorting algorithm based on ISO/IEC 10646 (Tibetan). The main idea is as follows: read the Tibetan syllables from the text and convert each one into a one-dimensional string of letters; identify the base character, rearrange the component letters of the syllable, and insert blank characters at the positions of missing components; sort the resulting syllable strings with quicksort; finally, restore the component letters of each syllable to their original order, remove the blank characters, and output the result.
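The pad-and-sort idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the syllables are already decomposed into components (base-character identification, the hard part, is skipped); the hypothetical `Syllable` class, the slot names, and the priority order in `KEY_SLOTS` are illustrative only and do not reproduce real Tibetan dictionary rules; each component is a single code point; and raw Unicode code points stand in for the sorting codes of ISO/IEC 10646 (Tibetan).

```python
from dataclasses import dataclass

# Assumed slot priority for the sort key (illustrative, not the paper's table).
KEY_SLOTS = ("base", "superscript", "prefix", "subscript",
             "vowel", "suffix", "second_suffix")
# Order in which the components are written within a syllable.
WRITTEN_SLOTS = ("prefix", "superscript", "base", "subscript",
                 "vowel", "suffix", "second_suffix")
BLANK = "\u0000"  # placeholder for a missing component; sorts before any letter


@dataclass(frozen=True)
class Syllable:
    """Hypothetical container for an already-decomposed syllable."""
    base: str
    prefix: str = ""
    superscript: str = ""
    subscript: str = ""
    vowel: str = ""
    suffix: str = ""
    second_suffix: str = ""


def to_key(s: Syllable) -> str:
    """Reorder components by priority and pad missing slots with BLANK."""
    return "".join(getattr(s, slot) or BLANK for slot in KEY_SLOTS)


def from_key(key: str) -> Syllable:
    """Undo to_key: drop the blanks and rebuild the component record."""
    parts = {slot: ("" if ch == BLANK else ch)
             for slot, ch in zip(KEY_SLOTS, key)}
    return Syllable(**parts)


def quick_sort(keys: list) -> list:
    """Simple (non in-place) quicksort on fixed-width key strings."""
    if len(keys) <= 1:
        return list(keys)
    pivot = keys[len(keys) // 2]
    smaller = [k for k in keys if k < pivot]
    equal = [k for k in keys if k == pivot]
    larger = [k for k in keys if k > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)


def tibetan_sort(syllables: list) -> list:
    """Key generation, quicksort, then restoration of each syllable."""
    return [from_key(k) for k in quick_sort([to_key(s) for s in syllables])]


def written(s: Syllable) -> str:
    """Join components in written order; a base under a superscript is
    encoded in its subjoined form (code point + 0x50 in the Tibetan block)."""
    base = chr(ord(s.base) + 0x50) if s.superscript else s.base
    parts = {slot: getattr(s, slot) for slot in WRITTEN_SLOTS}
    parts["base"] = base
    return "".join(parts[slot] for slot in WRITTEN_SLOTS)


words = [
    Syllable(base="\u0f42", suffix="\u0f44"),                   # gang
    Syllable(base="\u0f40", prefix="\u0f51", suffix="\u0f62"),  # dkar
    Syllable(base="\u0f40", vowel="\u0f74", suffix="\u0f53"),   # kun
    Syllable(base="\u0f40"),                                    # ka
]
for syllable in tibetan_sort(words):
    print(written(syllable))  # order: ka, kun, dkar, gang
```

Because every key has the same width and the blank sorts below every letter, a plain string comparison is enough to order the syllables; sorting the written forms directly would instead place dkar after gang, since its prefix letter has a higher code point than either base letter.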

Key words: Tibetan, syllable, Tibetan dictionary sort rules, ISO/IEC 10646(Tibetan), Tibetan-sort

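Abstract (translated from the Chinese): The letters that make up a Tibetan syllable have a definite order, and ISO/IEC 10646(Tibetan) specifies a sorting code for each Tibetan character. The structural complexity of Tibetan syllables, however, means that Tibetan cannot be sorted directly by the order of the letters composing each syllable, nor can these sorting codes be applied directly. A Tibetan sorting algorithm based on ISO/IEC 10646(Tibetan) is proposed. Its main idea is: read Tibetan syllables from the text and convert each into a one-dimensional letter string; identify the base character, adjust the order of the letters (components) composing the syllable, and add blank characters at the positions of missing components; sort the syllable strings with quicksort; return the letters (components) of each syllable to their original order, remove the blank characters, and output the result.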

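Key words (translated from the Chinese): Tibetan syllable, sorting rules for modern Tibetan character and word dictionaries, ISO/IEC 10646(Tibetan), Tibetan sorting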