Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (12): 43-48.


gAC: high performance AC algorithm based on GPU

CHEN Hu1, PENG Jiangfeng2, SHI Shaohuai1   

  1. School of Software Engineering, South China University of Technology, Guangzhou 510006, China
  2. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  • Online: 2012-04-21 Published: 2012-04-20

gAC:基于GPU的高性能AC算法

陈  虎1,彭江锋2,施少怀1   

  1. 华南理工大学 软件学院,广州 510006
  2. 华南理工大学 计算机科学与工程学院,广州 510006

Abstract: String matching is one of the oldest and most widely studied problems in computer science, and it has become a core operation in fields such as information retrieval and computational biology. However, the limited computing power and memory bandwidth of CPUs make it difficult to further improve the performance of traditional serial string matching algorithms. GPUs, by contrast, offer far greater computing power and memory bandwidth, and with the development of GPGPU technology they have achieved outstanding results in many applications. gAC is a high-performance parallel multi-string matching algorithm based on the AC (Aho-Corasick) algorithm and implemented on the GPU. It exploits the GPU's SIMT (Single-Instruction Multiple-Thread) execution model and coalesced memory access, applying optimizations such as reducing conditional branches and coalescing global memory accesses. With these optimizations, the string scanning speed reaches 51 Gb/s on a C1060 GPU, a 28-fold improvement over the CPU-based serial algorithm.

Key words: Graphics Processing Unit (GPU), Compute Unified Device Architecture (CUDA), parallel multi-string matching, parallel computation, AC algorithm

摘要: 字符串匹配是计算科学中研究最广泛的问题之一,已成为信息检索和生物计算等领域的核心操作。然而受限于CPU的计算能力和存储器访问带宽,传统的串行字符串匹配算法难以进一步提升性能。GPU在计算能力和存储器访问带宽上有很大提升,已经在很多应用上取得了卓越成效。gAC作为一种基于GPU的并行AC算法,针对GPU的SIMT(Single-Instruction Multiple-Thread)以及合并存储器访问的技术特点,采取了减少条件分支、合并访问全局存储器等优化方法,使得在C1060 GPU上的字符串扫描速度达到51 Gb/s,比基于CPU的串行算法提升了28倍。

关键词: 图形处理器(GPU), 计算统一设备框架(CUDA), 多字符串匹配, 并行计算, AC算法
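
The abstract names two ideas that a small example may make more concrete: reducing conditional branches in the scan loop and splitting the input among many threads. The following is a minimal CPU-side sketch in Python, not the authors' CUDA implementation. It assumes a dense Aho-Corasick state-transition table (so each input byte costs one table lookup, with no failure-link loop) and assumes the text is split into fixed-size chunks that overlap by the longest pattern length minus one. The names `build_dfa`, `scan`, and `scan_chunked` are hypothetical, and the abstract does not say how gAC partitions its input.

```python
from collections import deque

ALPHABET = 256  # one table column per byte value


def build_dfa(patterns):
    """Build a dense Aho-Corasick DFA: goto[state][byte] -> next state."""
    goto = [[-1] * ALPHABET]  # state 0 is the root
    out = [[]]                # indices of patterns that end at each state
    for idx, pat in enumerate(patterns):
        s = 0
        for b in pat:
            if goto[s][b] == -1:
                goto.append([-1] * ALPHABET)
                out.append([])
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
        out[s].append(idx)

    fail = [0] * len(goto)
    queue = deque()
    for b in range(ALPHABET):
        t = goto[0][b]
        if t == -1:
            goto[0][b] = 0      # missing root edges loop back to the root
        else:
            queue.append(t)     # depth-1 states fail to the root
    while queue:                # breadth-first: shallower states first
        s = queue.popleft()
        out[s] = out[s] + out[fail[s]]
        for b in range(ALPHABET):
            t = goto[s][b]
            if t == -1:
                # Fold the failure transition into the table itself.
                goto[s][b] = goto[fail[s]][b]
            else:
                fail[t] = goto[fail[s]][b]
                queue.append(t)
    return goto, out


def scan(text, goto, out):
    """Serial scan: one table lookup per byte. Returns (end_index, pattern_index)."""
    s, hits = 0, []
    for i, b in enumerate(text):
        s = goto[s][b]
        for idx in out[s]:
            hits.append((i, idx))
    return hits


def scan_chunked(text, patterns, goto, out, chunk):
    """Simulate per-thread scanning: each chunk reads `overlap` extra bytes
    and reports only the matches that start inside its own chunk."""
    overlap = max(len(p) for p in patterns) - 1
    hits = []
    for lo in range(0, len(text), chunk):  # each iteration stands in for one thread
        hi = min(lo + chunk, len(text))
        s = 0
        for i in range(lo, min(hi + overlap, len(text))):
            s = goto[s][text[i]]
            for idx in out[s]:
                if i - len(patterns[idx]) + 1 < hi:
                    hits.append((i, idx))
    return sorted(hits)
```

For example, with the patterns `[b"he", b"she", b"his", b"hers"]` and the text `b"ushers"`, both `sorted(scan(...))` and `scan_chunked(..., chunk=2)` report `[(3, 0), (3, 1), (5, 3)]`: "he" and "she" end at index 3 and "hers" ends at index 5. On a GPU, each chunk would be handled by a separate thread, and the order in which threads read the text from global memory determines whether accesses coalesce; that part is not modeled here.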