Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (8): 81-88. DOI: 10.3778/j.issn.1002-8331.2208-0292

• Theory, Research and Development •

Design and Optimization of Query Operator on GPU

LENG Fangling, LIU Jun, WU Yingying, BAO Yubin   

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
  2. Office of Information Construction and Network Security, Northeastern University, Shenyang 110169, China
  • Online: 2023-04-15 Published: 2023-04-15

GPU上的查询算子的设计与优化

冷芳玲,刘军,吴莹莹,鲍玉斌   

  1. 东北大学 计算机科学与工程学院,沈阳 110169
  2. 东北大学 信息化建设与网络安全办公室,沈阳 110169

Abstract: Selection, join, projection, and aggregation are basic operations in traditional relational databases. To optimize relational database queries on a GPU, each relational operator must be implemented with a corresponding GPU algorithm. Following the divide-and-conquer layered design of GDB, relational algebra is split into an operator layer and a primitive layer. Query processing on the GPU faces several difficulties: data transfer latency, excessive use of shared memory, a declining number of active threads, and communication latency caused by data exchange between threads. To address these problems, query optimization algorithms are implemented on the relatively recent Pascal architecture. Building on the principles of the existing join, aggregation, and conditional selection algorithms, the corresponding algorithms are designed and optimized. The optimizations increase the workload of each worker thread, achieve latency hiding between kernel computation and data transfer, and resolve the data skew problem in the join operation.

Key words: graphics processing unit (GPU), Pascal architecture, query operator, primitive operation
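
The split into an operator layer and a primitive layer can be illustrated with a small sketch. The code below is not the paper's implementation: it is a sequential Python emulation, under the assumption that a selection operator is composed from three data-parallel primitives (map, exclusive prefix sum, scatter), which is a common decomposition for stream compaction on GPUs. All function names (`map_primitive`, `exclusive_scan`, `scatter_primitive`, `select_operator`) are hypothetical and chosen for illustration only.

```python
# Hypothetical sketch: a selection operator built from primitives.
# Each loop body stands in for the work one GPU thread would do;
# nothing here runs on a GPU.

def map_primitive(pred, column):
    """Map primitive: evaluate the predicate independently per element."""
    return [1 if pred(value) else 0 for value in column]


def exclusive_scan(flags):
    """Prefix-sum primitive: out[i] is the sum of flags[0..i-1]."""
    out = [0] * len(flags)
    running = 0
    for i, flag in enumerate(flags):
        out[i] = running
        running += flag
    return out


def scatter_primitive(column, flags, offsets, out_size):
    """Scatter primitive: write each selected element to its output slot."""
    out = [None] * out_size
    for i, value in enumerate(column):
        if flags[i]:
            out[offsets[i]] = value
    return out


def select_operator(pred, column):
    """Operator layer: selection expressed only through the primitives above."""
    flags = map_primitive(pred, column)
    offsets = exclusive_scan(flags)
    # Output size = last offset plus last flag (0 for an empty column).
    out_size = offsets[-1] + flags[-1] if flags else 0
    return scatter_primitive(column, flags, offsets, out_size)


if __name__ == "__main__":
    print(select_operator(lambda x: x > 10, [5, 12, 7, 30, 11]))
```

In this arrangement the prefix sum gives every selected element a unique output position, so on a real GPU the scatter step could write results without write conflicts between threads; the operator itself contains no element-level logic beyond wiring the primitives together.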

摘要: 选择、连接、投影和聚集等是传统关系型数据库中的基本操作。为了实现关系型数据库在GPU上的查询优化,必须使用相应的GPU算法实现对应的关系算子。借鉴GDB分而治之的分层设计思想将关系代数拆分成算子层和原语层。数据查询处理过程中存在着一些难点问题,如数据传输时延、过度使用共享内存、活跃线程数减少和线程之间数据通信产生的通信时延。针对这些问题,基于较新的Pascal架构实现了查询优化算法,在原有的连接、聚集和条件选择算法原理基础上,对相应的算法进行了设计与优化。提高了每个工作线程的工作负载,实现了内核计算与数据传输之间的延迟隐藏,解决了连接操作中的数据倾斜问题。

关键词: 图形处理器(GPU), Pascal架构, 查询算子, 原语操作