Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (5): 73-80. DOI: 10.3778/j.issn.1002-8331.1507-0170


Optimizing parallel join of column-stores on heterogeneous computing platform

DING Xiangwu, CHEN Jinxin, WANG Mei   

  1. Institute of Computer, Donghua University, Shanghai 201620, China
  • Online: 2017-03-01 Published: 2017-03-03


丁祥武,陈金鑫,王  梅   

  1. 东华大学 计算机学院,上海 201620

Abstract: GPUs and integrated CPU-GPU architectures offer powerful parallel processing capability and programmable pipelines, and have become an active area of database research. To fully exploit the parallelism of heterogeneous platforms and improve query performance in column-store databases, this paper takes the architectural characteristics of heterogeneous platforms into account. First, it proposes ICMD (Improved CMD), a data partitioning strategy for joins that improves on the multi-dimensional data partitioning method CMD and uses stream processors to process the joins on the individual subspaces in parallel. Second, it uses a task evaluation and allocation model to distribute the query load dynamically, so that queries execute in parallel across the multi-core CPU, the GPU, and other accelerator components. In addition, the ICMD join algorithm is optimized with on-chip global synchronization and local memory reuse. Experiments on the SSB benchmark show that, on the Intel HD Graphics 4600 platform, the ICMD join query achieves a 1.35x speedup over the CPU version and an 18% performance improvement over the GPU query engine Ocelot.

Key words: multi-core Central Processing Unit-Graphics Processing Unit (CPU-GPU), stream processor, heterogeneous programming, column storage, Improved Coordinate Module Distribution (ICMD), dynamic evaluation of task allocation
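
The partition-then-join idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the paper's ICMD algorithm, whose details the abstract does not give: it is a plain modulo-partitioned hash join, with subspaces split between two devices in proportion to assumed throughput figures. All function names, the integer join keys, and the `cpu_rate` and `gpu_rate` parameters are hypothetical choices made for illustration.

```python
from collections import defaultdict


def partition_by_key(rows, key_index, num_partitions):
    """Place each row in subspace (key mod num_partitions)."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[row[key_index] % num_partitions].append(row)
    return partitions


def join_subspace(left_rows, right_rows, left_key, right_key):
    """Hash join inside one subspace; no other subspace is consulted."""
    table = defaultdict(list)
    for rrow in right_rows:
        table[rrow[right_key]].append(rrow)
    return [lrow + rrow
            for lrow in left_rows
            for rrow in table.get(lrow[left_key], [])]


def split_subspaces(num_partitions, cpu_rate, gpu_rate):
    """Assign subspace ids to two devices in proportion to their rates.

    The rates are hypothetical throughput estimates (e.g. rows per second).
    """
    cpu_count = round(num_partitions * cpu_rate / (cpu_rate + gpu_rate))
    ids = list(range(num_partitions))
    return ids[:cpu_count], ids[cpu_count:]


def partitioned_join(left, right, left_key, right_key,
                     num_partitions=4, cpu_rate=1.0, gpu_rate=3.0):
    """Join two lists of tuples on integer keys, one subspace at a time."""
    left_parts = partition_by_key(left, left_key, num_partitions)
    right_parts = partition_by_key(right, right_key, num_partitions)
    cpu_ids, gpu_ids = split_subspaces(num_partitions, cpu_rate, gpu_rate)
    result = []
    # Both id lists are processed sequentially here; on real hardware each
    # list would be handed to its own device and run concurrently.
    for pid in cpu_ids + gpu_ids:
        result.extend(join_subspace(left_parts[pid], right_parts[pid],
                                    left_key, right_key))
    return result
```

Because rows with equal keys always fall into the same subspace, each subspace can be joined without reference to the others, which is what makes the per-subspace work suitable for parallel execution. A dynamic scheme would presumably re-estimate the two rates from measured run times between batches; the sketch keeps them fixed.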

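Abstract (from the Chinese original): With their powerful parallel processing capability and programmable pipelines, GPUs and integrated CPU-GPU architectures have become a research hotspot in the database field. To make full use of the parallel computing power of heterogeneous platforms and improve the query performance of column-store systems, this work studies the structural characteristics of heterogeneous platforms and first proposes ICMD (Improved CMD), a data partitioning strategy for performing joins on GPU multi-threaded platforms, in which GPU stream processors process the joins on the individual subspaces in parallel. A task evaluation and allocation model then distributes the query load dynamically, so that query operations execute efficiently in parallel on the multi-core CPU and the GPU. The ICMD join algorithm is further optimized with an on-chip global synchronization mechanism and local memory reuse. Finally, tests on the SSB benchmark show that, on the Intel® HD Graphics 4600 platform, the parallel join query achieves a 35% performance improvement over the CPU version and an 18% improvement over the GPU query engine Ocelot.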

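Key words (from the Chinese original): multi-core Central Processing Unit-Graphics Processing Unit (CPU-GPU), stream processor, heterogeneous programming, column storage, Improved Coordinate Module Distribution (ICMD), dynamic task evaluation and allocation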