计算机工程与应用 ›› 2022, Vol. 58 ›› Issue (17): 72-80.DOI: 10.3778/j.issn.1002-8331.2112-0521

• 理论与研发 • 上一篇    下一篇

并行查询交互度量及执行计划选择

柳浩楠,牛保宁,程永强   

  1. 太原理工大学 信息与计算机学院,山西 晋中 030600
  • 出版日期:2022-09-01 发布日期:2022-09-01

Measurement for Parallel Query Interaction and Execution Plan Selection

LIU Haonan, NIU Baoning, CHENG Yongqiang   

  1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
  • Online:2022-09-01 Published:2022-09-01

摘要: 查询是数据库系统的主要负载,查询的执行效率直接影响着系统的性能。目前,由于查询交互(query interaction,QI)复杂多变,查询优化器不能准确地评估查询进入系统产生的影响,很难为并行查询选择较优执行计划。将查询的平均响应时间、平均执行时间、平均I/O时间和平均缓冲区命中率作为QI的特征参数,表示QI;提出多维度查询交互度量(multi-dimensional measurement of query interaction,MMQI)模型和执行计划选择(execution plan selection,EPS)模型,采用深度神经网络,在度量QI的基础上,把QI作为主要因素,为并行查询选择较优执行计划。考虑到查询执行计划是由一系列关系运算组成的,以及QI具有时域特性,MMQI采用双向长短期记忆神经网络(bidirectional long-short term memory,Bi-LSTM)度量QI,从查询执行计划提取特征作为输入,将QI特征参数的改变作为输出,预测查询采用不同执行计划进入系统后QI特征参数的改变;EPS把预测到的查询特征参数的改变作为查询交互特征(feature of query interaction,FQI),与查询候选执行计划特征(features of candidate plan,FCP)融合,作为另一个Bi-LSTM的输入,为查询动态地选择较优执行计划。在PostgreSQL上的实验表明,MMQI-EPS比查询优化器选择较优执行计划的平均准确率提高38.6个百分点。

关键词: 查询交互, 并行查询, 多维度查询交互度量-执行计划选择(MMQI-EPS), 较优执行计划, 神经网络

Abstract: Query is the main load of the database system, and the execution efficiency of the query directly affects the performance of the system. At present, due to the mutual complex and changeable influence between queries called query interaction(QI), the query optimizer cannot accurately evaluate the impact of queries entering the system, and it is difficult to choose a better execution plan for parallel queries. It regards mean response time, mean execution time, mean I/O time, and mean buffer hit ratio as QI parameters to measure QI. It proposes a multi-dimensional measurement of query interaction(MMQI) model and an execution plan selection(EPS) model, using a deep neural network, and on the basis of measuring QI, taking QI as the main factor of a better execution plan selection for parallel queries. Considering that the query execution plan is composed of a series of relational operations, and QI has time sequential characteristics, MMQI uses bidirectional long-short term memory(Bi-LSTM) to measure QI and extracts features of execution plan as the input of Bi-LSTM and take the change of QI parameters as the output of the model. EPS combines the predicted change of QI parameters and features of candidate plan(FCP) as the input of another Bi-LSTM to dynamically select a better execution plan for the query. Experiments on PostgreSQL show that the average accuracy of MMQI-EPS is 38.6 percentage points higher than the query optimizer in selecting a better execution plan.

Key words: query interaction, parallel query, multi-dimensional measurement of query interaction-execution plan selection(MMQI-EPS), better execution plan, neural network