Querying on evolving graphs based on compressed full-text index

Computer Engineering and Applications, 2015, Vol. 51, Issue (2): 117-124.

XIAO Yang1,2, ZHU Qing1,2, WU Yuewan1

1. Department of Computer Science, School of Information, Renmin University of China, Beijing 100872, China
2. Key Laboratory of Data Engineering and Knowledge Engineering, School of Information, Renmin University of China, Beijing 100872, China

Online: 2015-01-15 Published: 2015-01-12

Abstract

Abstract: Evolving graphs contain a large amount of temporal and spatial information, and some of the spatial information follows similar evolution patterns over time. This paper presents a query model for evolving graphs that finds the evolving subgraphs whose edges change in the same way over the same time range. Real-world evolving graphs are usually huge, so traversing the whole graph for every repeated query is very expensive. The existing enumeration-based hash index solves the query problem, but its preprocessing incurs considerable time and space overhead. To reduce the preprocessing cost on large-scale evolving graphs, the paper applies a compressed full-text indexing technique based on the Burrows-Wheeler transform and suffix arrays. For suffix array construction, it gives two different linear-time algorithms, which keeps the preprocessing cost stable. Experiments on Facebook, the Enron email system, and simulated datasets evaluate the feasibility, efficiency, and scalability of the algorithm.

Key words: evolving graph, query, evolving subgraph, suffix array, compressed full-text index
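
The index named in the abstract combines a suffix array with the Burrows-Wheeler transform. A minimal Python sketch of that general idea follows. It is an illustration under stated assumptions, not the paper's implementation: the suffix array is built by naive sorting rather than by the linear construction algorithms the abstract mentions, the tables are stored uncompressed for readability, and the function names (`suffix_array`, `build_fm_index`, `locate`) are hypothetical. The abstract does not say how an evolving graph is serialized into text, so the example indexes a plain string.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix they begin.
    # (Assumption: used only for clarity; this is not a linear algorithm.)
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_fm_index(text, sentinel="$"):
    # The sentinel must be absent from the text and sort before every symbol.
    assert sentinel not in text
    text += sentinel
    sa = suffix_array(text)
    # BWT[i] is the character that precedes the i-th smallest suffix
    # (text[-1], the sentinel, for the suffix starting at position 0).
    bwt = "".join(text[i - 1] for i in sa)

    # C[c] = number of characters in the text strictly smaller than c.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    # occ[c][i] = number of occurrences of c in bwt[:i] (uncompressed table).
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, bwt, C, occ


def locate(pattern, index):
    # Backward search: process the pattern right to left, narrowing the
    # half-open suffix-array interval [lo, hi) of suffixes that start with it.
    sa, bwt, C, occ = index
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_fm_index("banana")
print(index[1])               # BWT of "banana$"
print(locate("ana", index))   # start positions of "ana" in "banana"
```

Each step of `locate` costs two table lookups, so the number of steps depends on the pattern length rather than on the text length. A compressed index would replace the full `occ` table and suffix array with sampled or succinct structures.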

XIAO Yang, ZHU Qing, WU Yuewan. Querying on evolving graphs based on compressed full-text index[J]. Computer Engineering and Applications, 2015, 51(2): 117-124.
