计算机工程与应用 ›› 2015, Vol. 51 ›› Issue (18): 7-12.

• 博士论坛 • 上一篇    下一篇

基于GPU的AC模式匹配改进算法

汪  宏1,2,王  鹏1,2   

  1. 1.中国科学院 上海高等研究院,上海 201210
    2.上海市信息技术研究中心,上海 201210
  • 出版日期:2015-09-15 发布日期:2015-10-13

Improved AC pattern matching algorithm based on GPU

WANG Hong1,2, WANG Peng1,2   

  1. 1.Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China
    2.Shanghai Information Technology Research Center, Shanghai 201210, China
  • Online:2015-09-15 Published:2015-10-13

摘要: 字符串匹配算法的应用非常广泛,在信息检索、信息安全等领域都起着关键的作用。近年来,由于GPU通用计算的高速发展,且GPU具有很强的并行计算能力和很高的存储器访问带宽,利用GPU来加速字符串匹配算法吸引了越来越多的关注。提出的改进的AC模式匹配算法,在对前人工作的基础上,进一步消除了output表的存储,将纹理存储器中的查表操作转换为数值比较操作,与改进前算法相比,速度提高了80%以上;进一步的,引入了多个可变参数,提高AC算法的有效数据匹配率,并优化线程块的大小,优化后的算法与采用一种特殊匹配方式的高效的PFAC算法相比,速度提高了9%以上。

关键词: 图形处理器(GPU)计算, 模式匹配, Aho-Corasick算法, 统一计算架构(CUDA)编程模型

Abstract: String matching is a widely used algorithm which plays a pivotal role in areas such as information retrieval, information security, etc.. In recent years, due to the rapid development of general-purpose GPU computing, fast parallel computing ability and high bandwidth memory access of GPU, using GPU to accelerate string matching algorithm has attracted more and more attention. On the basis of the previous work, the improved AC pattern matching algorithm which is proposed in this paper, further eliminates the storage requirement of the output table, converts the look-up operations in texture memory to numerical comparison operations. Compared to the previous algorithm which doesn’t adopt this improvement, the speed is accelerated by over 80%. Moreover, several variables are introduced to enhance effective data matching rate and optimize thread block size. Compared to an efficient algorithm-PFAC, which uses a special matching mode, the optimized algorithm is more than 9% faster.

Key words: Graphic Process Unit(GPU) computing, pattern matching, Aho-Corasick algorithm, Compute Unified Device Architecture(CUDA) programming model