Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (23): 136-139.

• Database, Data Mining, Machine Learning •

Research on a HASH linear-scan algorithm incorporating MD5 for obtaining increments

郭  亮,杨金民   

  1. College of Information Science and Engineering, Hunan University, Changsha 410082
  • Publication date: 2014-12-01 Release date: 2014-12-12

Research of incremental extraction based on MD5 and HASH algorithm

GUO Liang, YANG Jinmin   

  1. College of Information Science and Engineering, Hunan University, Changsha 410082, China
  • Online:2014-12-01 Published:2014-12-12

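Abstract: To enable fast incremental extraction from a database, this paper examines traditional incremental extraction methods and proposes an algorithm that obtains the increment by a HASH linear scan incorporating MD5. Every record in the database can be regarded as a string. A HASH algorithm is used to build a hash table of the backup records, and the original records are used to probe that table, so that the increment is obtained in a linear scan and the number of comparisons is reduced. At the same time, the MD5 algorithm is used to generate a "fingerprint" of each record, which shortens the string handled in each HASH computation and comparison and improves efficiency. The proposed algorithm was tested on an ORACLE database, and the results show that it is much more efficient than traditional methods.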

Keywords: incremental extraction, MD5 algorithm, HASH algorithm, linear scan

Abstract: To achieve rapid incremental extraction from a database, an algorithm that blends MD5 into a HASH linear scan to obtain the increment is put forward, based on an analysis of traditional incremental extraction methods. Each record in the database can be seen as a character string. A hash table of the backup records is generated with a HASH algorithm, and the original records are used to probe the hash table, so that a linear scan is enough to obtain the increment and the number of comparisons decreases. Meanwhile, a fingerprint of each record is generated with the MD5 algorithm, which reduces the length of the character string in every HASH operation and comparison and improves efficiency. The algorithm has been tested on an ORACLE database, and the results show that its efficiency is greatly improved compared with traditional methods.

Key words: incremental extraction, MD5 algorithm, HASH algorithm, linear scan
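
The approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the field separator, and the representation of records as in-memory tuples are assumptions made for illustration, and rows are assumed to be unique (for example because each contains a primary key). The sketch builds a hash table of MD5 fingerprints from the backup records, then probes it once per original record.

```python
import hashlib

FIELD_SEP = "\x1f"  # assumed separator, chosen because it rarely occurs in data


def fingerprint(record):
    """Return the 32-character MD5 hex digest of a record viewed as a string."""
    joined = FIELD_SEP.join("" if v is None else str(v) for v in record)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def extract_increment(original_rows, backup_rows):
    """Compare current rows against backup rows in two linear passes.

    Returns (added_or_changed, removed_or_stale):
      added_or_changed - rows in the original with no match in the backup
      removed_or_stale - backup rows that no original row matched
    """
    # Pass 1: hash table of backup records, keyed by fingerprint.
    table = {fingerprint(row): row for row in backup_rows}

    added_or_changed = []
    # Pass 2: probe the table once per original record.
    for row in original_rows:
        if table.pop(fingerprint(row), None) is None:
            added_or_changed.append(row)

    # Whatever is left in the table was never matched.
    removed_or_stale = list(table.values())
    return added_or_changed, removed_or_stale


if __name__ == "__main__":
    backup = [(1, "a"), (2, "b"), (3, "c")]
    original = [(1, "a"), (2, "B"), (4, "d")]
    print(extract_increment(original, backup))
```

Each record is hashed and probed once, so the work grows linearly with the number of records, and only fixed-length digests are compared rather than full record strings. A modified row appears in both result lists (its new version in the first, its old version in the second); pairing the two would need a key column, which this sketch does not assume.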