Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (31): 49-52.

• Research and Discussion •

Discover difference between sequences

ZHANG Xiaomin, CHEN Hao, MING Zhong

  1. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
  • Received:1900-01-01 Revised:1900-01-01 Online:2011-11-01 Published:2011-11-01

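Finding the changed content of sequences

张晓敏,陈 昊,明 仲

  1. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060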

Abstract: The problem is to find the difference Z between two sequences X and Y, such that X can be transformed into Y using the difference information Z. If X and Y are two versions of the same file, Z is usually much smaller than Y, so storing or transferring Z instead of Y saves resources. To solve this problem, X and Y are both divided into several parts, and corresponding parts are treated as sub-problems; at the same time, the sub-sequences common to both are identified. The algorithm runs in O((n+m)log(min(n,m))) time and O(n+m) space, where n and m are the lengths of X and Y, respectively.

Key words: differencing algorithm, Longest Common Subsequence (LCS), difference, change
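
To illustrate how difference information Z can turn X into Y, the following minimal Python sketch builds a delta with the standard-library `difflib.SequenceMatcher` and replays it against X. It is not the algorithm described in the abstract; the delta format of `("copy", ...)` and `("insert", ...)` operations and the names `make_delta` and `apply_delta` are assumptions made for this example.

```python
from difflib import SequenceMatcher


def make_delta(x, y):
    """Build a delta Z that turns list x into list y.

    Hypothetical format: ("copy", start, end) reuses x[start:end];
    ("insert", items) carries literal items taken from y.
    """
    delta = []
    matcher = SequenceMatcher(None, x, y, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("insert", y[j1:j2]))
        # A "delete" opcode needs no operation: that part of x is simply skipped.
    return delta


def apply_delta(x, delta):
    """Rebuild y from x and the delta produced by make_delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(x[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


# Two versions of a small "file", one element per line.
x = ["a", "b", "c", "d"]
y = ["a", "b", "X", "d"]
z = make_delta(x, y)
assert apply_delta(x, z) == y
```

When the two versions differ in a single line, the delta carries only that line as literal data and refers to the rest of X by position, which is why Z tends to be much smaller than Y.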

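Abstract: The task is to find the differences between two sequences X and Y and to produce difference information Z of Y relative to X, such that X can be changed into Y according to Z. If X and Y are different versions of the same file, Z is often much smaller than Y, so storing or transmitting Z saves more resources than storing or transmitting Y directly. A heuristic algorithm based on the structural characteristics of the sequences is proposed: the two sequences are partitioned, and identical content is then matched. The time complexity of the algorithm is O((n+m)log(min(n,m))) and the space complexity is O(n+m), where n and m are the lengths of X and Y, respectively. In practical applications, the running speed of the algorithm is close to linear.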

Key words: differencing algorithm, longest common subsequence, difference, change
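
The idea of dividing both sequences into parts and treating corresponding parts as sub-problems can be sketched in Python as follows. This is only an illustration of the general divide-and-match pattern, not the heuristic proposed in the paper, and it makes no claim to the stated O((n+m)log(min(n,m))) bound. It splits on the longest common contiguous block found by the standard-library `difflib.SequenceMatcher.find_longest_match`; the function name `common_blocks` is an assumption made for this example.

```python
from difflib import SequenceMatcher


def common_blocks(x, y):
    """Return sorted (i, j, size) triples with x[i:i+size] == y[j:j+size]."""
    matcher = SequenceMatcher(None, x, y, autojunk=False)
    blocks = []
    # Each stack entry is one sub-problem: a window of x paired with a window of y.
    stack = [(0, len(x), 0, len(y))]
    while stack:
        x_lo, x_hi, y_lo, y_hi = stack.pop()
        m = matcher.find_longest_match(x_lo, x_hi, y_lo, y_hi)
        if m.size == 0:
            continue  # nothing in common inside this pair of windows
        blocks.append((m.a, m.b, m.size))
        # The match divides both sequences; the parts to its left and to its
        # right become two independent sub-problems.
        stack.append((x_lo, m.a, y_lo, m.b))
        stack.append((m.a + m.size, x_hi, m.b + m.size, y_hi))
    return sorted(blocks)


blocks = common_blocks(list("abcXdef"), list("abcYdefZ"))
```

Everything outside the returned blocks is the changed content, so a delta only needs to store those gaps together with references to the matched blocks.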