Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (19): 129-131.DOI: 10.3778/j.issn.1002-8331.2010.19.037

• 数据库、信号与信息处理 • Previous Articles     Next Articles

Improved sort method for Chinese strings

ZHANG Hai-jun1,2,3,DING Xi-yuan2,ZHU Chao-yong2,3   

  1. 1.Department of Computer Science and Technology,Xinjiang Normal University,Urumqi 830054,China
    2.Research Center of Computer & Language Information Engineering,Chinese Academy of Sciences,Beijing 100097,China
    3.School of Computer Science and Technology,University of Science and Technology of China,Hefei 230027,China

  • Received:2009-12-28 Revised:2010-03-22 Online:2010-07-01 Published:2010-07-01
  • Contact: ZHANG Hai-jun

一种改进的中文字符串排序方法

张海军1,2,3,丁溪源2,朱朝勇2,3   

  1. 1.新疆师范大学计算机科学与技术系,乌鲁木齐830054
    2.中国科学院计算机语言信息工程研究中心,北京100097
    3.中国科学技术大学计算机科学与技术学院,合肥230027
  • 通讯作者: 张海军

Abstract: At present,the time complexity of the fastest sort algorithm for Chinese strings is Onlgn).Radix sort algorithm,
whose time complexity is Odn),is one of the most efficient sort methods,but it is fit for integer data with identical digits.
This paper puts forward a fast transform method used to convert strings to an integer arrays with identical length.The integer
arrays representing strings are sorted by radix sort algorithm to achieve rapid sort for Chinese strings.Experiments show
that the improved algorithm can quickly sort Chinese strings and its performance is better than that of quick sort algorithm.
The relationship between sort time and data size is linear and the time complexity of the algorithm is Odn).

摘要: 对中文字符串排序,最快算法的时间复杂度是Onlgn)。基数排序算法是目前最快的排序方法之一,时间复杂度是Odn),但其一般适用于相同长度的整型数据排序。提出了一种快速的变换方法,将字符串转换为与之等长的整型数组,使用基数排序算法对代表字串的整型数组排序,用以实现对字符串的快速排序。实验表明,提出的算法能快速地进行中文字符串排序,比快速排序算法具有更好的性能,且排序时间与数据规模之间是线性关系,算法的时间复杂度为Odn)。

CLC Number: