计算机工程与应用 ›› 2019, Vol. 55 ›› Issue (1): 70-75.DOI: 10.3778/j.issn.1002-8331.1710-0224

• 大数据与云计算 • 上一篇    下一篇

层次序列索引的大规模动态标签图子图查询

任成林,姜丽雁,单晓欢,宋宝燕   

  1. 辽宁大学 信息学院,沈阳 110036
  • 出版日期:2019-01-01 发布日期:2019-01-07

Hierarchical Sequence Index for Subgraph Query on Large-scale Dynamic Labeled Graph

REN Chenglin, JIANGLiyan, SHAN Xiaohuan, SONG Baoyan   

  1. School of Information, Liaoning University, Shenyang 110036, China
  • Online:2019-01-01 Published:2019-01-07

摘要: 标签图常用于智能交通网、生物信息网等新兴领域的建模。子图查询作为图数据分析的关键问题,引起了研究者的广泛关注。对现有子图查询算法的研究发现,随着图数据规模增大且频繁更新,传统子图查询算法普遍存在查询效率低,存储开销大,忽略顶点标签信息等问题。为此,提出了一种支持大规模动态标签图子图查询的层次序列索引(Dynamic Hierarchical Sequence,DHS),该索引提取数据图中带有顶点编号的层次拓扑序列关系以实现子图查询;针对图的动态变化,提出了更新点拓扑扩展式索引维护策略,仅从局部变化顶点及边开始进行增量式更新,大大降低了重建索引造成的巨大开销;提出了基于DHS索引的子图查询方法,仅需将查询图与数据图的层次序列进行匹配即可获得候选集,并在其上利用关系匹配策略获得最终查询结果。实验证明提出的方法在保证高效查询的同时降低了索引的创建及维护时间,提高了子图查询效率。

关键词: 大规模动态标签图, 子图查询, 层次拓扑序列, 图索引

Abstract: Labeled graphs are often used for modeling in emerging fields such as smart transportation and bioinformatics. Subgraph query as the key problem of data analysis has attracted extensive attention of researchers. The research of existing subgraph query algorithms show that with the increasing size and frequent updating of graphs, traditional algorithms have many problems such as low query efficiency, large storage overhead and ignoring vertex label information. Therefore, a dynamic hierarchical sequence index for subgraph query on large scale dynamic labeled graph is proposed, which is named DHS. The index extracts the hierarchical topological sequence with the id of vertexes from the graph to realize subgraph query. With the changes of graph, an updating vertex topology extended index maintenance strategy is proposed, which can update from local changing vertices or edges incrementally, greatly reducing the huge overhead caused by the reconstruction index. And a subgraph query method which based on DHS index is proposed that only by matching the hierarchical sequences of query graph and data graph to obtain the candidate set, then the relational matching strategy is used to obtain final query results. Experimental results show that the proposed method improves the query efficiency and reduces the creation and maintenance cost of index.

Key words: large-scale dynamic labeled graph, subgraph query, topological hierarchical sequence, graph index