Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (17): 84-92.

Previous Articles     Next Articles

Twig query optimization based on partial evaluation and hot-trace compilation

WAN Ganghui1, LIAO Husheng2, SU Hang1, GAO Hongyu1, GAO Wanchen1   

  1. 1.College of Computer Science, Beijing University of Technology, Beijing 100124, China
    2.School of Software Engineering, Beijing University of Technology, Beijing 100124, China
  • Online:2016-09-01 Published:2016-09-14

基于部分求值和热踪编译的Twig查询优化方法

万刚辉1,廖湖声2,苏  航1,高红雨1,高万辰1   

  1. 1.北京工业大学 计算机学院,北京 100124
    2.北京工业大学 软件学院,北京 100124

Abstract: XML tree pattern query also called Twig query, is the core operation of the XML query. In the study of Twig query algorithm, TreeMatch algorithm is considered to be one of the best Twig query algorithm, since it reduces intermediate results to a great extent. However, in the core operation getNext of TreeMatch algorithm, there are many calculations relying on the Twig pattern. When the number of getNext calls is a lot, the redundant calculation will affect the performance of TreeMatch algorithm. To obtain higher improvement, this paper proposes a novel optimization method for this algorithm based on partial evaluation and hot-trace compilation. Firstly, the method takes Twig pattern as the invariant to do partial evaluation, and translates the query into a sequence of Twig query machine instructions, which avoids some redundant computation in Twig pattern query process. Secondly, the method does hot-trace compilation optimization for the instruction sequence to achieve more efficiency in interpretive execution. Experimental results show that the efficiency of Twig query will increase by 20% to 60% using the optimization method.

Key words: Twig, TreeMatch, partial evaluation, hot-trace compilation

摘要: XML树模式查询又称为Twig查询,是XML查询处理中最核心的操作。在Twig查询算法的研究中,TreeMatch算法由于极大程度上减少了中间结果的产生,被认为是最好的Twig查询算法之一。然而,在TreeMatch算法的核心操作getNext中,存在不少仅依赖Twig模式的计算。当getNext调用次数很多时,这种冗余的重复计算会影响TreeMatch算法的性能。为了进一步改进该算法,提出了一种基于部分求值和热踪编译的Twig查询优化方法,该方法以Twig模式作为不变量进行部分求值,把查询请求翻译成一种Twig查询机指令序列,避免了查询过程中对Twig模式的重复计算;并且针对这种查询机指令序列的解释过程,利用热踪编译技术进行了优化。对比实验说明基于部分求值和热踪编译的优化方法能够将Twig查询效率提高到20%到60%。

关键词: Twig, TreeMatch, 部分求值, 热踪编译