计算机工程与应用 ›› 2020, Vol. 56 ›› Issue (13): 84-92.DOI: 10.3778/j.issn.1002-8331.1905-0266

• 大数据与云计算 • 上一篇    下一篇

面向RDF图的多模式匹配方法

孙云浩,李逢雨,李冠宇,韩冰,邢维康   

  1. 大连海事大学 信息科学技术学院,辽宁 大连 116026
  • 出版日期:2020-07-01 发布日期:2020-07-02

Multi-pattern Matching Method for RDF Graph

SUN Yunhao, LI Fengyu, LI Guanyu, HAN Bing, XING Weikang   

  1. School of Information Science Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Online:2020-07-01 Published:2020-07-02

摘要:

模式匹配问题指的是搜索所有同构于模式图的数据子图,它是一种典型的子图同构问题。多模式匹配问题是对模式匹配问题的一个扩展,其主要的挑战是多个模式图之间的并发执行策略。为了应对这个挑战,提出一种面向RDF图的模式匹配方法(M-PM)。通过计算多个模式图之间的公共查询子图,根据查询子图、模式图的包含关系构建依赖树;提出节点分片表的概念,用来扩展依赖树中单一的包含关系;设计了一种快速的多模式匹配算法,其通过对数据图的一次遍历便可以求得多个模式图的匹配子图。实验结果表明,M-PM方法比一般方法提高了约70%执行时间效率。在处理相同规模的模式图的情况下,M-PM方法执行效率只与残差边个数有关,残差边越少执行效率越高。

关键词: RDF图, 多模式匹配, 依赖树, 节点分片表

Abstract:

The problem of multi-pattern matching is an extension of pattern matching, whose main challenge is the concurrent execution strategy among multiple query graphs. To solve this challenge, a Multi-Pattern Matching(M-PM) method for RDF graph is proposed. Firstly, a Dependency Tree(D-Tree) is constructed based on the inclusion relations of multiple query graphs and maximum public query subgraphs. Secondly, a Node Fragmentation Table(NFT) is proposed to extend the single inclusion relation of D-Tree. Finally, a speed up algorithm of multi-pattern matching is designed, which can obtain the results of multiple query graphs by one-off traversal of data graph. The experimental results show that M-PM method can improve the matching time-efficiency of more 70% than general method. With the fixed scales of query graphs, matching time-efficiency is consistent with the quantity of residual edges, where the more the residual edges are and the higher the time-efficiency is.

Key words: RDF graph, Multi-Pattern Matching(M-PM), Dependency Tree(D-Tree), Node Fragmentation Table(NFT)