Computer Engineering and Applications, 2007, Vol. 43, Issue (30): 5-10.

• Doctoral Forum •

Parallel query processing oriented to network data management

WANG Yong1,2,3, JIAO Li-mei1,2

  1. National Research Center for Intelligent Computing System, Beijing 100080, China
  2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
  3. Graduate School of Chinese Academy of Sciences, Beijing 100039, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-10-21 Published: 2007-10-21
  • Contact: WANG Yong

Abstract: With the rapid growth of the Internet, the massive data produced by network monitoring poses a challenge to query processing. Exploiting the fact that such data fall clearly into two classes, a large volume of event data and a small, stable set of configuration data, this paper proposes a parallel query processing approach built on single-machine DBMSs. From the perspective of relational algebra, an arbitrary query is decomposed into sub-queries over horizontal data partitions and a post-processing query that merges the intermediate results. Using the database links provided by the DBMS, the query processor can be constructed conveniently without modifying the DBMS. Tests on a real workload show that near-linear scalability is achieved when the intermediate result set is not very large.
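The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: SQLite in-memory databases and Python threads stand in for the single-machine DBMS nodes and database links, and the schema (`events`, `config`), the sample rows, and all function names are hypothetical. The event table is partitioned horizontally across nodes, the small configuration table is assumed to be replicated on every node, each node runs the same sub-query, and a post-processing query merges the intermediate results (here by summing partial counts and sums).

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: events(dev, bytes) is the large event table,
# config(dev, kind) is the small, stable configuration table.
CONFIG = [("r1", "router"), ("r2", "router"), ("s1", "switch")]
EVENTS = [("r1", 100), ("s1", 40), ("r2", 60),
          ("r1", 10), ("s1", 5), ("r2", 30)]

# Sub-query: runs unchanged on every horizontal partition.
SUB_QUERY = """
    SELECT c.kind, COUNT(*) AS n, SUM(e.bytes) AS total
    FROM events e JOIN config c ON e.dev = c.dev
    GROUP BY c.kind
"""

# Post-processing query: merges partial aggregates (COUNT becomes SUM of counts).
POST_QUERY = """
    SELECT kind, SUM(n), SUM(total)
    FROM partial GROUP BY kind ORDER BY kind
"""


def make_node(event_rows):
    """Create one node holding a slice of events and a full copy of config."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE config (dev TEXT, kind TEXT)")
    conn.execute("CREATE TABLE events (dev TEXT, bytes INTEGER)")
    conn.executemany("INSERT INTO config VALUES (?, ?)", CONFIG)
    conn.executemany("INSERT INTO events VALUES (?, ?)", event_rows)
    return conn


def run_sub_query(conn):
    """Evaluate the sub-query on one partition and return its partial rows."""
    return conn.execute(SUB_QUERY).fetchall()


def parallel_query(nodes):
    """Run sub-queries concurrently, then merge them with the post-query."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(run_sub_query, nodes))
    merger = sqlite3.connect(":memory:")
    merger.execute("CREATE TABLE partial (kind TEXT, n INTEGER, total INTEGER)")
    for rows in partials:
        merger.executemany("INSERT INTO partial VALUES (?, ?, ?)", rows)
    return merger.execute(POST_QUERY).fetchall()


def single_node_query(event_rows):
    """Reference answer: the same query on one node holding all events."""
    conn = make_node(event_rows)
    return conn.execute(SUB_QUERY + " ORDER BY c.kind").fetchall()


if __name__ == "__main__":
    # Round-robin horizontal partitioning of the event table over 3 nodes.
    nodes = [make_node(EVENTS[i::3]) for i in range(3)]
    print(parallel_query(nodes))
```

In this sketch only the small per-partition aggregates travel to the merging node, which is consistent with the abstract's observation that scalability depends on the intermediate result set staying small.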